A recent item on the Wishlist for Stata 18 claims that reshape does not preserve (value) labels. The exact problem is not clear from that post; however, a problem that is probably related was reported here.
According to this FAQ, earlier versions of Stata did not preserve (value) labels when reshaping. Newer versions of Stata should preserve them and, as far as I can tell, they do.
I will use the example dataset from the linked FAQ:
Code:
clear
input id year answer inc
1 80 0 5000
1 81 1 5500
1 82 0 6000
2 80 1 2000
2 81 0 2200
2 82 1 3300
3 80 0 3000
3 81 1 2000
3 82 1 1000
end
label define answer 0 "Yes" 1 "No"
label define year 80 "1980" 81 "1981" 82 "1982"
label values answer answer
label values year year
label variable id "Identification"
label variable year "Year of study"
label variable answer "Answer to question"
label variable inc "value of inc"
Here is what I get with Stata/SE 17.0 for Windows (64-bit x86-64), update level 06 Apr 2022 (outdated) on Windows 11:
Code:
. describe
Contains data
Observations: 9
Variables: 4
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Variable Storage Display Value
name type format label Variable label
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id float %9.0g Identification
year float %9.0g year Year of study
answer float %9.0g answer Answer to question
inc float %9.0g value of inc
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sorted by:
Note: Dataset has changed since last saved.
. label list
year:
80 1980
81 1981
82 1982
answer:
0 Yes
1 No
. reshape wide inc answer, i(id) j(year)
(j = 80 81 82)
Data Long -> Wide
-----------------------------------------------------------------------------
Number of observations 9 -> 3
Number of variables 4 -> 7
j variable (3 values) year -> (dropped)
xij variables:
inc -> inc80 inc81 inc82
answer -> answer80 answer81 answer82
-----------------------------------------------------------------------------
. describe
Contains data
Observations: 3
Variables: 7
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Variable Storage Display Value
name type format label Variable label
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id float %9.0g Identification
answer80 float %9.0g answer 80 answer
inc80 float %9.0g 80 inc
answer81 float %9.0g answer 81 answer
inc81 float %9.0g 81 inc
answer82 float %9.0g answer 82 answer
inc82 float %9.0g 82 inc
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sorted by: id
. label list
answer:
0 Yes
1 No
.
end of do-file
Now, let's reshape long. You may or may not know that reshape stores lots of information in characteristics. This information facilitates quickly switching between wide and long. We will get rid of all those characteristics and pretend that we have started with the wide layout.
Code:
mata : st_local("reshape_chars", invtokens(st_dir("char", "_dta", "*")'))
foreach c of local reshape_chars {
    char define _dta[`c'] // void
}
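To see what the loop above clears, it may help to look at the dataset characteristics on a toy example. The following is only a minimal sketch: the variables id, t, and x are hypothetical, and the snippet starts with clear, so it should be run in a separate session rather than in the middle of the example above. On my understanding, reshape stores its settings in characteristics whose names begin with ReS_, but the exact names may differ across versions, so the sketch just lists whatever is there.

```stata
* Toy dataset (hypothetical variables id, t, x), not the FAQ data
clear
input id t x
1 1 10
1 2 11
2 1 20
2 2 21
end

* Go to wide; reshape records what it did in dataset characteristics
reshape wide x, i(id) j(t)

* List all characteristics attached to the dataset
char list _dta[]

* Because the settings were remembered, no arguments are needed to go back
reshape long
describe
```

Once those characteristics are removed, reshape long without arguments no longer knows what to do, which is why the full syntax is used after the loop above.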
If I now reshape long again, I get:
Code:
. reshape long inc answer , i(id) j(year)
(j = 80 81 82)
Data Wide -> Long
-----------------------------------------------------------------------------
Number of observations 3 -> 9
Number of variables 7 -> 4
j variable (3 values) -> year
xij variables:
inc80 inc81 inc82 -> inc
answer80 answer81 answer82 -> answer
-----------------------------------------------------------------------------
. describe
Contains data
Observations: 9
Variables: 4
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Variable Storage Display Value
name type format label Variable label
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
id float %9.0g Identification
year byte %10.0g
answer float %9.0g answer
inc float %9.0g
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sorted by: id year
Note: Dataset has changed since last saved.
. label list
answer:
0 Yes
1 No
.
end of do-file
I cannot find any problems with (value) labels after reshape. Perhaps I am missing the problem; maybe it only occurs in a specific edition of Stata and/or on a specific OS. I encourage everyone, but especially those claiming that there is a problem, to try to replicate my results. Please state the version, edition, and update level of Stata that you are using. Also, please state the OS you are on.
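For anyone who wants to replicate this, the steps above could be condensed into one do-file along the following lines. This is only a sketch: it uses the FAQ data with just the answer value label attached, reuses the characteristic-clearing loop from above, and turns the visual check into assert statements. Whether the assertions pass on other versions, editions, or operating systems is exactly what I would like to find out.

```stata
* Report the environment (please include this output in your reply)
about
display "Stata version: " c(stata_version) "  OS: " c(os) "  machine: " c(machine_type)

* FAQ example data
clear
input id year answer inc
1 80 0 5000
1 81 1 5500
1 82 0 6000
2 80 1 2000
2 81 0 2200
2 82 1 3300
3 80 0 3000
3 81 1 2000
3 82 1 1000
end
label define answer 0 "Yes" 1 "No"
label values answer answer

reshape wide inc answer, i(id) j(year)

* Each wide answer variable should still carry the value label "answer"
foreach v of varlist answer80 answer81 answer82 {
    local lbl : value label `v'
    assert "`lbl'" == "answer"
}

* Remove all dataset characteristics, pretending we started in wide layout
mata : st_local("reshape_chars", invtokens(st_dir("char", "_dta", "*")'))
foreach c of local reshape_chars {
    char define _dta[`c']
}

reshape long inc answer, i(id) j(year)

* The long answer variable should carry the value label again
local lbl : value label answer
assert "`lbl'" == "answer"
label list answer
```

If one of the assert lines fails, the do-file stops with an error at that point; posting that output together with the environment lines at the top should make the problem reproducible.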
