  • Label variables automatically

    Hi everyone,

    I would like to automatically attach the labels given in one row of my dataset to the values in the same column.
    To illustrate, the dataset looks as follows:
    Var1       Var2    Var3         Var4                  Var5
    1 Male     1 Yes   1 Agree      1 Strongly agree      1 Yes
    0 Female   0 No    0 disagree   2 Agree               0 No
                                    3 Disagree
                                    4 strongly disagree
    0          1       1            3                     1
    0          0       1            4                     1
    1          1       1            1                     1
    1          0       0            2                     0
    So far, I have not found a way to combine the labels with the values.
    I have looked at labmask (which is still likely to be part of the solution), but I have not yet worked out how to use it here.

    I'm thankful for any hints.

    Best,

    Jay

  • #2
    Hello Jay,

    The data likely come from a data entry package such as CSPro, a survey management system such as Survey Solutions, or some other source. Perhaps it has an option to export the data to Stata files directly? (Both of the packages mentioned above do.)

    Best, Sergiy



    • #3
      Jay, if you are importing the data from Excel, the following code might work. Note: I took this from Elan Cohen’s answer in the old Statalist archive thread "extract labels from first row of data".
      See also here and here


      Code:
      *** Remove leading and trailing spaces
      ds, has(type string) alpha
      foreach var of varlist `r(varlist)' {
           replace `var' =  trim(`var')
      }
      
      *** If cells in row 1 contain line breaks (char(10)) or carriage returns (char(13)),
      *** e.g. "1 Male" and "0 Female" on separate lines within one cell,
      * replace them with a comma (change the separator to whatever you prefer)
      ds, has(type string) alpha
      foreach var of varlist `r(varlist)' {
           replace `var' = subinstr(`var', char(13) + char(10), ", ", .)  // Windows-style CR+LF pairs first
           replace `var' = subinstr(`var', char(10), ", ", .)
           replace `var' = subinstr(`var', char(13), ", ", .)
      }
      
      *** Label variable with contents of observation==1 (i.e. Var1[1], Var2[1], etc)
      foreach var of varlist * {
          label variable `var' "`=`var'[1]'"  // the left quote before the equal sign and the right quote after [1] make Stata evaluate the expression inline
      }
      
      *** After assigning the labels, it might be necessary to delete those strings from the first row
      foreach var of varlist * {
        replace `var' = "" if _n==1
        destring  `var', replace
      }
      Last edited by David Benson; 15 Nov 2018, 12:56.



      • #4
        Now I have a question of my own for Statalist. Since I copied part of my answer in #3 from somebody else's code, I don't fully understand how one piece works (the syntax is slightly different from what I would expect).

        I created the following toy dataset to test this:
        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str20 hhid str32 line str18 living_home str20 rev_2010
        "current household id" "most current household member id" "Living in the home" "Revenue in year 2010"
        end
        Why does this work:
        Code:
        foreach var of varlist * {
            label variable `var' "`=`var'[1]'"
        }

        And this doesn't? (I place the contents of var[1] in a local macro and then try to use the local macro as the label)
        Code:
        foreach var of varlist * {
              local var_label `var'[1]
              display `var'[1]  // This and next line just for me to make sure running correctly
              display `var_label'
              label variable `var' "`var_label'"
        }
        
        * Stata output below (so it's able to extract `var'[1] just fine)
        current household id
        current household id
        most current household member id
        most current household member id
        Living in the home
        Living in the home
        Revenue in year 2010
        Revenue in year 2010
        
        * But variable labels are incorrect
        desc
        
        ---------------------------------------------------------------------------------------------------------------------------------------------
                      storage   display    value
        variable name   type    format     label      variable label
        ---------------------------------------------------------------------------------------------------------------------------------------------
        hhid            str20   %20s                  hhid[1]
        line            str32   %32s                  line[1]
        living_home     str18   %18s                  living_home[1]
        rev_2010        str20   %20s                  rev_2010[1]
        Similarly, why does it need to be label variable `var' "`=`var'[1]'" and not label variable `var' "`var'[1]"?
        What does the equal sign do in the label variable command, and why do I need the ` to the left of the equal sign and the ' to the right of [1]?
        Last edited by David Benson; 15 Nov 2018, 13:28.



        • #5
          Ok, there are two sources of confusion here.

          First,

          Code:
          local var_label `var'[1]
          evaluates to

          Code:
          local var_label hhid[1]
          the first time through the loop. Thus, you are assigning the literal string hhid[1] to the local macro var_label. Instead, you want to evaluate the expression hhid[1] and store its contents, which requires the equal sign:

          Code:
          local var_label = `var'[1]
          Second, instead of

          Code:
          display `var_label'
          you want

          Code:
          display "`var_label'"
          The former evaluates to

          Code:
          display hhid[1]
          Because display evaluates expressions, you see current household id. The latter evaluates to

          Code:
          display "hhid[1]"
          which displays the literal string hhid[1] and so shows what the macro really contains.

          Best
          Daniel
          Last edited by daniel klein; 15 Nov 2018, 13:42.



          • #6
            Thanks @daniel klein, your explanation was super helpful!

            For the sake of completeness, both methods below now work:
            Code:
            ** This works (what I had above)
            foreach var of varlist * {
                label variable `var' "`=`var'[1]'"
            }
            
            ** This also works (thanks to Daniel's explanation above)
            foreach var of varlist * {
                  local var_label = `var'[1]  // the = makes Stata evaluate the expression and store its result
                  label variable `var' "`var_label'"
            }
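
            To compare the three forms side by side, here is a minimal sketch on a made-up one-variable dataset. The variable name hhid is borrowed from the toy example in #4, and the macro names a and b are purely illustrative. The inline form relies on Stata's expression-evaluating macro expansion, where the text after the equal sign is evaluated and its result is substituted into the command.
            Code:
            clear
            input str20 hhid
            "current household id"
            "101"
            end

            * Without the equal sign, the macro stores the literal text hhid[1]
            local a hhid[1]
            display "`a'"    // shows hhid[1]

            * With the equal sign, Stata evaluates the expression and stores its result
            local b = hhid[1]
            display "`b'"    // shows current household id

            * Inline evaluation: the result of hhid[1] is substituted directly, with no intermediate macro
            label variable hhid "`=hhid[1]'"
            describe hhid
            The inline form saves a line, while the intermediate macro makes the value easier to inspect with display before it is used.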
            Last edited by David Benson; 15 Nov 2018, 21:48.

