Some scales and questionnaires have sub-scales, where each sub-scale score is the sum of several items. Computing such sub-scale scores is easy when no items are missing: one can use egen with rowtotal(varlist). E.g.,
Code:
egen score = rowtotal(v1-v5)
But if some items are missing, one may wish to compute a prorated sum, provided that some minimum number of variables have valid values. For example, suppose I want to compute the prorated sum of v1-v5, but only if at least 3 of the variables have valid values.
Code:
input v1-v5
1 1 1 1 1
. 1 1 1 1
. . 1 1 1
. . . 1 1
. . . . 1
. . . . .
end
In SPSS, I would do something like this:
Code:
COMPUTE ProSum = MEAN.3(v1 to v5)*5.
Or more generally, to make it work better for long variable lists where I may not want to count the number of variables:
Code:
COMPUTE ProSum = MEAN.3(v1 to v5)*(NVALID(v1 to v5)+NMISS(v1 to v5)).
That 3 in MEAN.3 is the minimum number of valid values required to compute a mean.
Now don't laugh (too hard), but so far the best I've come up with in Stata is this:
Code:
egen RMean = rowmean(v1-v5) // row mean
egen Rmiss = rowmiss(v1-v5) // # of missing values
egen Rvalid = rownonmiss(v1-v5) // # of valid values
generate prosum = RMean*(Rmiss+Rvalid) if Rvalid >= 3
drop RMean-Rvalid
There must be a more efficient way to do this, but so far my searches have been fruitless. Any tips appreciated.
Cheers,
Bruce (still a relative Stata newbie)
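For reference, the prorated sum described above is the mean of the valid items multiplied by the total number of items, set to missing when fewer than 3 items are valid. Here is a slightly shorter sketch of the same idea. It is only one possible variant, not necessarily the most efficient one. The names items, k and nvalid are illustrative choices, and it assumes v1-v5 are numeric and stored next to each other in the dataset.

```stata
* Example data from the post: 6 respondents, 5 items
clear
input v1-v5
1 1 1 1 1
. 1 1 1 1
. . 1 1 1
. . . 1 1
. . . . 1
. . . . .
end

* Expand the variable range and count the items, so that the
* number of items (5 here) does not have to be typed by hand
unab items : v1-v5
local k : word count `items'

* Number of valid (nonmissing) items per row
egen nvalid = rownonmiss(v1-v5)

* Row mean of the valid items, then prorate it to a sum,
* keeping it only where at least 3 items are valid
egen prosum = rowmean(v1-v5)
replace prosum = cond(nvalid >= 3, prosum * `k', .)

drop nvalid
list
```

With the example data, prosum should be 5 in the first three rows and missing in the last three, since those rows have fewer than 3 valid items.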