Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Capture var names plus labels from Excel

    Hi all,

    I've been given Excel data arranged something like this (200+ vars); can anyone see a simple way to import it, keeping Row 1 as var names; and then assigning Row 2 as var labels? Split into two 'sheets' but then what? (test Excel attached).
    TT1 TT2 CODE CODE1 CODE2
    Num Province Name Province Code District District Code
    1 a 11 f 16
    2 b 12 g 17
    3 c 13 h 18
    4 d 14 i 19
    5 e 15 j 20
    cheers
    Andrew
    Attached Files
    __________________________________________________ __
    Assistant Professor, Department of Biostatistics and Epidemiology
    School of Public Health and Health Sciences
    University of Massachusetts- Amherst

  • #2
    This came up a couple of times before, but it seems hard to find the threads. Here is one from the old listserver.

    The idea is easy enough. Import the spreadsheet using import excel with firstrow option. Then loop over all variables assigning varname[1] as the variable label to varname. Last, drop in 1, to get rid of the first row. Probably you also need to destring all the variables holding numeric values but are stored as strings (due to the first observation being a label).

    Best
    Daniel

    Comment


    • #3
      Thanks Daniel, that's exactly what I needed.
      __________________________________________________ __
      Assistant Professor, Department of Biostatistics and Epidemiology
      School of Public Health and Health Sciences
      University of Massachusetts- Amherst

      Comment


      • #4
        Thanks also from me.
        Anyone feeling generous enough to show me a simpler way to do this (eg 3 vars)?
        foreach k in A1 A2 A3 {
        local x=`k'[1]
        rename `k' `x'
        }
        drop in 1
        Regards
        Laurence

        Comment


        • #5
          What do you mean by "simpler"?

          If you have variable names (not labels) in the first row of your Excel spreadsheet, specify the firstrow option with your import excel call. This is documented. If you mean something else, please elaborate on what you are looking for.

          Best
          Daniel

          Comment


          • #6
            Based on Daniel suggestion from #2 the following code will do the work:

            Code:
            ​clear*
            import excel label_test.xlsx, sheet("Sheet1") firstrow
            foreach v of var TT1-CODE2 {
            la var `v' "`=`v'[1]'"
            replace `v'="" in 1
             }
            drop in 1
            destring _all, replace
            Last edited by Oded Mcdossi; 11 May 2015, 01:30.

            Comment


            • #7
              Originally posted by Oded Mcdossi View Post
              Based on Daniel suggestion from #2 the following code will do the work:

              Code:
              ​clear*
              import excel label_test.xlsx, sheet("Sheet1") firstrow
              foreach v of var TT1-CODE2 {
              la var `v' "`=`v'[1]'"
              replace `v'="" in 1
              }
              drop in 1
              destring _all, replace
              I tried this code for my data set but got an error (type mismatch) for labels that are missing:

              foreach v of varlist _all {
              la var `v' "`=`v'[1]'"
              replace `v'="" in 1
              }
              drop in 1

              Here is my output:

              (1 real change made)
              (1 real change made)
              (1 real change made)
              (1 real change made)
              (1 real change made)
              (1 real change made)
              (1 real change made)
              (1 real change made)
              (1 real change made)
              (1 real change made)
              (1 real change made)
              type mismatch
              r(109);

              end of do-file

              Comment


              • #8
                Apparently whatever is the 12th variable in the data set is not a string, hence the type mismatch message. That could happen if the "label" information in the first observation for that variable has all numeric content, or perhaps it is all missing values.

                In any case, the -replace `v' = "" in 1- command is completely unnecessary since the next step after the loop is to drop the first observation. So you may as well remove that command and just leave that first observation as you found it, given that you will be dropping it anyway.

                Comment

                Working...
                X