Dear All,
I would like to understand whether the example code below is inefficient, or whether I am missing some subtle point.
Consider the following fragment of code shown in the "Analytic Guidelines: Creating Disability Identifiers Using the Washington Group Short Set on Functioning (WG-SS) Stata Syntax".
Variable HEAR_SS is defined to have response categories 1, 2, 3, 4, 7, 8, and 9. (The same pattern is applied by the author(s) to other variables throughout the document, so it is probably intentional.)
Code:
gen Hearing=HEAR_SS if inlist(HEAR_SS, 1, 2, 3, 4)
replace Hearing=. if inlist(HEAR_SS, 7, 8, 9)
tabulate Hearing
When I look at this example, the second line (the replace command) appears redundant. Because generate assigns a missing value to every observation that fails the if condition, Hearing is already a dot (missing value) for any value of HEAR_SS other than 1, 2, 3, and 4. What is the rationale for including this line in the guidelines?
- is it a legacy from some earlier version of Stata that behaved differently?
- is it somehow accounting for other values which may end up in the raw data files? (Unlikely: I don't see how, and if that were the goal I don't see why one would not write !inlist(HEAR_SS,1,2,3,4) there.)
- anything else?
- or is it redundant and can be removed without any consequences?
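The redundancy claim can be checked on toy data. The sketch below is only an illustration and is not part of the guidelines: the seven observations are made up to cover each documented response category once, and the variable Before is a hypothetical helper used to compare values before and after the replace.

```stata
* Toy data: one observation per documented response category (made up)
clear
input byte HEAR_SS
1
2
3
4
7
8
9
end

* First line from the guidelines
gen Hearing = HEAR_SS if inlist(HEAR_SS, 1, 2, 3, 4)

* Observations failing the if condition are already missing
assert missing(Hearing) if !inlist(HEAR_SS, 1, 2, 3, 4)

* Keep a copy, run the second line, and confirm nothing changed
* (in Stata, . == . evaluates to true, so the comparison covers missings)
gen Before = Hearing
replace Hearing = . if inlist(HEAR_SS, 7, 8, 9)
assert Hearing == Before
```

If both assert commands pass silently, the replace line made no difference on this data.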
I'd opt for
Code:
gen byte Hearing=HEAR_SS if inlist(HEAR_SS, 1, 2, 3, 4)
tabulate Hearing
or for extra caution:
Code:
assert inlist(HEAR_SS,1,2,3,4,7,8,9)
gen byte Hearing=HEAR_SS if inlist(HEAR_SS, 1, 2, 3, 4)
tabulate Hearing
Thank you,
Sergiy Radyakin