Hi all,
I'm running Stata 12 and noticed something odd about how encode assigns numeric values and labels. Below I encode some example string variables in two separate ways:
1) I encode the strings using value labels whose text is identical to the original string values
2) I encode one of the variables using a value label whose text differs from the original string values
With approach 1, the strings are encoded in the intended order when viewed with labels, and tabulating v1 without its labels shows the values (1, 2, 3); the same holds for v2.
With approach 2, however, the strings again appear in the intended order when viewed with labels, but tabulating v2 without its labels now shows the values (4, 5, 6).
Why?
For my actual data I need v2 to have the values (1, 2, 3), but I do not want the attached labels to be identical to the original strings. I suspect the problem is in how I'm using label define, but I have been unable to troubleshoot this seemingly trivial problem.
Code:
// 1) define labels same as string variables
clear all
inp str20(v1)
"Second string"
"String one"
"And number three"
end
input str20(v2)
"Third string"
"Another string one"
"Second again"
end

label define order1 1 "String one" 2 "Second string" 3 "And number three"
encode v1, g(_v1) label(order1)
drop v1
rename _v1 v1

label define order2 1 "Another string one" 2 "Second again" 3 "Third string"
encode v2, g(_v2) label(order2)
drop v2
rename _v2 v2

tab v1
tab v1, nol
tab v2
tab v2, nol

// 2) define labels different than string variables
clear all
inp str20(v1)
"Second string"
"String one"
"And number three"
end
input str20(v2)
"Third string"
"Another string one"
"Second again"
end

label define order3 1 "String one" 2 "Second string" 3 "And number three"
encode v1, g(_v1) label(order3)
drop v1
rename _v1 v1

label define order4 1 "One" 2 "Two" 3 "Three"
encode v2, g(_v2) label(order4)
drop v2
rename _v2 v2

tab v1
tab v1, nol
tab v2
tab v2, nol
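
A possible explanation and workaround, offered as a sketch rather than a definitive answer: as I read the help for encode, label(name) reuses a value label that already exists and adds to it when a string is not already among its entries. In approach 2 none of the v2 strings appear in order4, so they would be appended as new entries after 3, which would account for the values (4, 5, 6). Under that assumption, one way to get (1, 2, 3) with different label text is to encode against a label that matches the strings and then overwrite the label text with label define, modify. The label name order4 and the texts "One", "Two", "Three" are taken from the example above.

```stata
// Sketch: encode against matching label text first, then relabel.
clear all
input str20(v2)
"Third string"
"Another string one"
"Second again"
end

// Step 1: the label text matches the strings, so encode maps them to 1, 2, 3
label define order4 1 "Another string one" 2 "Second again" 3 "Third string"
encode v2, generate(_v2) label(order4)
drop v2
rename _v2 v2

// Step 2: replace the label text; the underlying numeric values are unchanged
label define order4 1 "One" 2 "Two" 3 "Three", modify

tab v2
tab v2, nolabel
```

If this reading is right, tab v2 should show One, Two, Three, and tab v2, nolabel should show 1, 2, 3, because the relabeling happens only after encode has already matched the strings.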