Hello!
My variable labels do not "carry over" to the new variables after I reshape my datasets from long to wide, and I am having trouble saving, retrieving, and reapplying the variable labels to the reshaped data. My value labels are retained; only the variable labels are lost. I would like, for example, the variable label for each original variable ev901_ to be applied to each new variable ev901_1...ev901_n. Ideally, I would have the original label plus a suffix or prefix identifying whether it is _1, _2,..._n, but I would be satisfied with the same original variable label applied to all reshaped variables. I am hoping someone here can help me troubleshoot my code or suggest an alternative approach.

I am reshaping, in a loop, 12 datasets (each for a different country). The number of reshaped variables (the _n above) ranges from 6 to 11 depending on the dataset. The basic structure of my do-file is as follows:

Code:
    **********************
    ***** SET UP
    **********************
    **Set local variables for country names, then file numbers of IR and EV files,
    local cnames "KH EG HN JO KE KY MW TJ TZ UG ZM ZW"
    local IRnames "72 61 62 6C 70 61 61 61 63 60 61 62"
    local EVnames "72 61 62 6C 70 61 61 61 63 60 61 62"
    **Set index
    local i = 0
    **Set number of countries - here = 12 /*WILL NEED TO UPDATE IF ADDING GUATEMALA*/
    forvalues x = 1/12 {
        local i = `i' + 1
        ** get names of country, IR, and EV files for country number x
        local c : word `i' of `cnames'
        local IR : word `i' of `IRnames'
        local EV : word `i' of `EVnames'
        *Start from EV file
        use "$path`c'EV12`EV'.dta", clear
        ********************
        ***RESHAPE: long to wide EV file
        ********************
        *Drop unnecessary variables
        drop v005 v007 v008 v011 v017 v018 v019 v101 v102 v106 ///
            ev906-ev917 ev913a ev902a
        **Add suffix "_" to var stubs
        rename (ev* cmcclock numevents endmo startmo1 startmo obsmo obsmoend obsdur mfporno start startstate) =_
        rename (evid_) (evid)
        **Reshape
        reshape wide ev004_ ev9* cmcclock_ eventorder_ evinwindow_ numevents_ endmo_ start* obs* mfporno_, i(survey v000 v001 v002 v003 caseid) j(evid)
        save "$path`c'EVw.dta", replace
    }

I have tried a couple of approaches to apply the variable labels, all unsuccessfully, as follows.

First, I adapted some code that I found in a response to a post (which I can no longer locate) in which someone had the same problem when reshaping their data from wide to long. I added this code BEFORE the reshape command:

Code:
    **Save variable labels in local macro to reapply after reshaping
    ds ev* survey cmcclock numevents endmo startmo1 startmo obsmo obsmoend obsdur mfporno start startstate
    local ev_vars `r(varlist)'
    foreach v of varlist `ev_vars' {
        local var_label_`v': var label `v'
    }

(I should note that at the time I ran this code, all my variables were named as listed immediately above; I had not yet added the "_" suffix to them as I now do in the first code box.) Then I ran the reshape command, and added this code AFTER the reshape:

Code:
    **Retrieve and reapply var labels
    foreach d of local ev_vars {
        foreach v of varlist `e'* {
            local number: subinstr local v "`e'" ""
            label var `v' `"`var_label_`e'' `number'"'
        }
    }

This appeared to work, in that Stata returned no error messages when I ran the code. However, the variable labels were not applied to the reshaped variables.

Next, I tried to adapt some code described in this FAQ: http://www.stata.com/support/faqs/da...after-reshape/
However, I had difficulty separating the code relevant to applying value labels (which I don't need) from the code for applying variable labels, and I abandoned the attempt.

Third, I tried more of a brute-force approach. I defined a program with the variable labels in it as follows:

Code:
    program define make_labels
    local li=1
    while `li'<=maxli{
    lab var ev004_`li'="Event number"
    lab var ev900_`li'="CMC event begins"
    lab var ev901_`li'="CMC event ends"
    lab var ev901a_`li'="Duration of event"
    lab var ev902_`li'="Event code"
    lab var ev903_`li'="Discontinuation code"
    lab var ev904_`li'="Previous event"
    lab var ev905_`li'="Next event"
    lab var cmcclock_`li'="CMC start of observation period"
    lab var eventorder_`li'="timing of event relative to clock start (12mos)"
    lab var evinwindow_`li'="Event occurs in observation period"
    lab var numevents_`li'="Total number of events woman experiences in observation period"
    lab var endmo_`li'="Month before interview in which event ended"
    lab var startmo1_`li'="Month before interview in which event started"
    lab var startmo_`li'="Month before interview within observation period in which event started"
    lab var obsmo_`li'="Month in observation period in which event started"
    lab var obsmoend_`li'="Month in observation period in which event ended"
    lab var obsdur_`li'="Duration of event within observation period"
    lab var mfporno_`li'="Event is modern temporary contraception or not"
    lab var startstate_`li'="Modern contraceptive use is state at clock start (12mos)"
    li=`li'+1
    }
    end

The program is defined BEFORE the loop opens, near where I set the locals for the file names (cnames, IRnames, EVnames). I also added this line where I set those locals:

Code:
    **Set local for max li value (number of episodes--evid--in datafile) for labeling vars
    local linames "7 10 11 10 7 9 7 9 7 9 7 6"

And this line after the forvalues statement, following the similar line for local EV:

Code:
    ** get max li for country number x
    local maxli : word `i' of `linames'

The defined program is then run after the reshape command.

This strategy didn't work. When it is run exactly this way, I get the error "maxli not found" when the make_labels program tries to run. I tried removing "word" from the line of code reading "local maxli : word `i' of `linames'" (since I want this to be a value and not a string), but then I get the error "1 not allowed". I also commented out both the local linames "..." and the local maxli : word... lines of code and changed the program definition to read while `li'<=11{ (since 11 is the highest maxli in any of the datasets). Doing this gave me an invalid syntax error.

I tried a couple of other ways to set maxli as well. I tried:

Code:
    *Set maxli for each survey (used in make_labels program) /*Will need to UPDATE if Guatemala is added*/
    if v000=="ZW"{
    scalar maxli=6
    }
    if inlist(v000,"KH","KE","MW","TZ","ZM"){
    scalar maxli=7
    }
    if inlist(v000,"KY","TJ","UG"){
    scalar maxli=9
    }
    if inlist(v000,"EG","JO"){
    scalar maxli=10
    }
    if v000=="HN" {
    scalar maxli=11
    }

and

Code:
    scalar maxli=6 if `c'==ZW
    scalar maxli=7 if `c'==KH | `c'==KE | `c'==MW | `c'==TZ | `c'==ZM
    scalar maxli=9 if `c'==KY | `c'==TJ | `c'==UG
    scalar maxli=10 if `c'==EG | `c'==JO
    scalar maxli=11 if `c'==HN

and this (which is essentially the same thing):

Code:
    scalar maxli=6 if v000=="ZW"
    scalar maxli=7 if v000=="KH" | v000=="KE" | v000=="MW" | v000=="TZ" | v000=="ZM"
    scalar maxli=9 if v000=="KY" | v000=="TJ" | v000=="UG"
    scalar maxli=10 if v000=="EG" | v000=="JO"
    scalar maxli=11 if v000=="HN

In each of these cases, Stata balked at the if statement.

Any guidance or suggestions would be most welcome!
Thanks,
Kerry MacQuarrie