A recent item on the Wishlist for Stata 18 claims that reshape does not preserve (value) labels. The problem is not exactly clear from that post. However, a probably related problem was reported here.
According to this FAQ, in earlier versions of Stata, reshape did not preserve (value) labels. Newer versions of Stata should preserve them and, as far as I can tell, they do.
I will use the example dataset from the linked FAQ:
Code:
clear
input id year answer inc
1 80 0 5000
1 81 1 5500
1 82 0 6000
2 80 1 2000
2 81 0 2200
2 82 1 3300
3 80 0 3000
3 81 1 2000
3 82 1 1000
end
label define answer 0 "Yes" 1 "No"
label define year 80 "1980" 81 "1981" 82 "1982"
label values answer answer
label values year year
label variable id "Identification"
label variable year "Year of study"
label variable answer "Answer to question"
label variable inc "value of inc"
Here is what I get with Stata/SE 17.0 for Windows (64-bit x86-64), update level 06 Apr 2022 (outdated) on Windows 11:
Code:
. describe

Contains data
 Observations:             9
    Variables:             4
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
id              float   %9.0g                 Identification
year            float   %9.0g      year       Year of study
answer          float   %9.0g      answer     Answer to question
inc             float   %9.0g                 value of inc
-------------------------------------------------------------------------------
Sorted by:
     Note: Dataset has changed since last saved.

. label list
year:
          80 1980
          81 1981
          82 1982
answer:
           0 Yes
           1 No

. reshape wide inc answer, i(id) j(year)
(j = 80 81 82)

Data                               Long   ->   Wide
-----------------------------------------------------------------------------
Number of observations                9   ->   3
Number of variables                   4   ->   7
j variable (3 values)              year   ->   (dropped)
xij variables:
                                    inc   ->   inc80 inc81 inc82
                                 answer   ->   answer80 answer81 answer82
-----------------------------------------------------------------------------

. describe

Contains data
 Observations:             3
    Variables:             7
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
id              float   %9.0g                 Identification
answer80        float   %9.0g      answer     80 answer
inc80           float   %9.0g                 80 inc
answer81        float   %9.0g      answer     81 answer
inc81           float   %9.0g                 81 inc
answer82        float   %9.0g      answer     82 answer
inc82           float   %9.0g                 82 inc
-------------------------------------------------------------------------------
Sorted by: id

. label list
answer:
           0 Yes
           1 No

.
end of do-file
Now, let's reshape long. You may or may not know that reshape stores lots of information in characteristics. That information facilitates switching quickly between wide and long. We will get rid of all those characteristics and pretend that we have started with the wide layout.
Code:
mata : st_local("reshape_chars", invtokens(st_dir("char", "_dta", "*")'))
foreach c of local reshape_chars {
    char define _dta[`c'] // void
}
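To see what the loop above removes, one could list the dataset characteristics before running it, and again afterwards to confirm that they are gone. This is only a sketch. I would expect reshape's entries to have names beginning with ReS_, but the exact set is an assumption on my part and may differ between Stata versions.

```stata
* run BEFORE the loop above: lists every characteristic attached to _dta,
* which should include whatever reshape wide stored
char list _dta[]

* ... run the mata/foreach snippet above here ...

* run AFTER the loop: if the loop cleared everything, nothing is listed
char list _dta[]
```

Note that the loop clears every characteristic attached to _dta, not only those written by reshape. That is harmless in this toy example, but in a real dataset notes are also stored as _dta characteristics and would be lost.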
If I now reshape long again, I get
Code:
. reshape long inc answer , i(id) j(year)
(j = 80 81 82)

Data                               Wide   ->   Long
-----------------------------------------------------------------------------
Number of observations                3   ->   9
Number of variables                   7   ->   4
j variable (3 values)                     ->   year
xij variables:
                      inc80 inc81 inc82   ->   inc
             answer80 answer81 answer82   ->   answer
-----------------------------------------------------------------------------

. describe

Contains data
 Observations:             9
    Variables:             4
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
id              float   %9.0g                 Identification
year            byte    %10.0g
answer          float   %9.0g      answer
inc             float   %9.0g
-------------------------------------------------------------------------------
Sorted by: id year
     Note: Dataset has changed since last saved.

. label list
answer:
           0 Yes
           1 No

.
end of do-file
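For anyone replicating this, the same check can be done programmatically rather than by reading the describe output by eye. The following is a minimal sketch. It assumes the data are in the long layout produced above and that the variable and label names are those of the FAQ example. The local macro names (vl, txt0, txt1) are mine.

```stata
* show which value label, if any, is attached to each variable
foreach v of varlist _all {
    local vl : value label `v'
    display as text "`v':" _col(12) as result "`vl'"
}

* stop with an error if answer is no longer attached to the value label
* "answer", or if the label's contents have changed
local vl : value label answer
assert "`vl'" == "answer"
local txt0 : label answer 0
local txt1 : label answer 1
assert "`txt0'" == "Yes" & "`txt1'" == "No"
```

The assert lines only cover the variable answer. The loop prints the attachment for every variable, so differences between versions or platforms should be easy to spot and report.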
I cannot find any problems with (value) labels after reshape. Perhaps, I am missing the problem; maybe the problem only occurs in a specific edition of Stata and/or on a specific OS. I encourage everyone, but especially those claiming that there is a problem, to try and replicate my results. Please state the version, edition, and update level of Stata that you are using. Also, please state the OS you are on.
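To make such reports comparable, something along the following lines could be pasted along with the results. It is a sketch that relies on the about command and a few c() system values. What each one prints will depend on your installation.

```stata
* version, edition, and revision date of the running Stata
about

* the same kind of information as individual system values
display "Stata version: " c(stata_version)
display "Born date:     " c(born_date)
display "OS:            " c(os) " " c(osdtl)
display "Machine type:  " c(machine_type)
```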