Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Assign variables labels based on values in external excel sheet

    Dear Statalisters,

    I am working with a set of raw panel data (.xls files), where sheet1 contains data on the following: country, year, and one variable-of-interest (i.e. gdp, aid received, etc.). I imported sheet1 of each raw dataset, cleaned, and merged them on the country and year variables so that they contain: country, year, and all variables-of-interest. I saved this as a tempfile: `latinamerica'. See below.
    country year gdp aid [...etc]
    Aruba 2003 1.044e+08
    Aruba 2004 -13900000
    Bahamas 1996 6.182e+09 1980000
    Bahamas 1997 6.309e+09 5100000
    Bahamas 1998 6.607e+09 34540000
    And sheet2 contains metadata, including a variable called INDICATOR_NAME whose single value is a string, which I would like to assign as the variable label to the variable-of-interest. See below.
    INDICATOR_CODE INDICATOR_NAME SOURCE_NOTE
    NY.GDP.MKTP.KD GDP (constant 2010 US$) GDP at purchaser's prices is the sum of gross value...
    I, then, imported sheet2 from each raw dataset and saved the first (and only) value of INDICATOR_NAME (i.e. "GDP (constant 2010 US$)") as a scalar. However, I’m having difficulty assigning this scalar value to the variable-of-interest (i.e. gdp).

    I’ve attached my code below. It is with the second loop, that I’m having trouble. I would appreciate any help. Thanks!

    Code:
     local variables gdp aid […etc]
    
    [lines omitted]  
    
     tempfile latinamerica
     save `latinamerica'
    
    * store variable label as a local (from original excel data)
     foreach var in `variables' {
         import excel "$data/wb_`var'", sheet(sheet2) firstrow clear
         scalar define `var'_l = INDICATOR_NAME[1]
         di `var'_l
     }
    
    * apply labels
     use `latinamerica', clear
     foreach var of varlist `variables' {
         label var `var' `=scalar(`var'_l)’
        save `latinamerica’, replace
     }

  • #2
    I am presuming that your variable labels occur in the same order, row-wise, as is the left to right order of variable names in the "variables" local, but since you didn't say, I don't know for sure. Your use of scalars here is much of what caused you problems. I can't remember seeing a situation in which I found a string scalar to be useful, as they are hard to program with, but perhaps you have seen others' uses that I'm not aware of. That being said, I think this will work, but I have not created an example to test it. Note that the FAQ all new members are asked to read indicates that you should provide example data with -dataex-. Had you done this, you'd be sure to get a solution that worked.

    Code:
    local variables gdp aid […etc]
    [lines omitted]  
    tempfile latinamerica
    save `latinamerica'
    // Put variable labels into a *sequence* of locals.  The technique that follows seems
    // a good use for a loop over observations, which otherwise is not Stata-ish.
    import excel "$data/wb_`var'", sheet(sheet2) firstrow clear
    local Nlab = _N
    forval i =1/`Nlab' {
       local vlabel`i' = INDICATOR_NAME[`i']
       di "`vlabel`i''"
    }
    // Apply the sequence of labels to the sequence of variable names
    use `latinamerica', clear
    local i 1
    foreach var of local variables {
       label var `var'  "`vlabel`i''"
       local ++i
    }
    //

    Comment


    • #3
      Thanks, Mike! To clarify, each variable (and thus each variable label) is stored in a different .xml file. And in each file, the variable (data) and variable label are located in different sheets. So I need to loop over files, not rows. Nevertheless, your response was helpful in helping me edit. This ended up working for me:

      Code:
      local variables gdp aid [...etc]
      
      [lines omitted]
      
      tempfile latinamerica
      save `latinamerica'
      
      * extract variable labels from original excel data 
        foreach var in `variables' {
            import excel "$data/wb_`var'", sheet(Metadata - Indicators) firstrow clear
            local `var'label = INDICATOR_NAME[1]
        }
      
      * apply labels to variable names
        use `latinamerica', clear
        foreach var of varlist `variables' {
             label var `var'  "``var'label'"
        }
      I'm still unsure of why I can't replace the local with a scalar though.

      Comment


      • #4
        Whoops, sorry I overlooked your feature of things in different files.

        It's possible you could use a scalar in some way. However, my advice is that scalars are not really intended to be used to construct code, but rather to hold values for display or calculation. I think the issue is that while the contents of locals can be deferenced with ` and ', scalars require `=NameOfScalar' My impression is that dereferencing of scalars so they can be used in code is a relatively recent addition to Stata (??), since one almost never sees experienced Stata users use scalars in code except to for values such as _N. (e.g., forvalues i = 1/`=_N' ...)

        Comment


        • #5
          That makes sense -- Thank you! I'm not a big fan of using scalars either, but it's nice to understand it a bit more.

          Comment

          Working...
          X