svy with mixed

Brian V. Carolan

Join Date: Apr 2014

Posts: 4
#1

16 Jun 2014, 13:53

Hi,

I am using Stata SE 13.1 on Windows to run a mixed model. From what I understand, there is no way to account for design effects by using svy with these types of models.

For example, here is the code:
mixed testscore i.IV1 c.IV2 i.IV1#c.IV2 CV || LEV2ID:, mle

The model includes an interaction term between a categorical IV and a continuous IV. Could anyone kindly suggest a way to adjust for design effects (PSU, stratum, weight), since svy does not work with these types of models? Thanks for your help.
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#2

16 Jun 2014, 19:18

Perhaps you overlooked the section on "Survey Data" in the manual entry for mixed. mixed does take a sampling weight for the first-level units ([pweight=]) and a different weight for the second-level units (the pweight() option, with pwscale() to control how the weights are scaled). If LEV2ID is the primary sampling unit (PSU), then you get robust standard errors clustered at that level. Otherwise, specify vce(cluster psu_id) to get robust standard errors based on PSUs.

What is missing is a stratum specification, but that omission will only inflate standard errors. You can recover some of the lost precision by adding school characteristics as predictors in the second-level model. These could include characteristics related to the sample strata.

The mle option in your code implies that you are not very interested in estimating the variance for the second-level units (otherwise reml is preferred). If that's so, then you can skip the second-level weight specification.
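
A minimal sketch of that specification, run on simulated data so it stands on its own. The weight and PSU variables (w_student, w_school, psu_id) and all simulated values are hypothetical, not taken from Brian's data; only the model variables reuse his names. Treat it as an illustration of the syntax rather than a recommended analysis.

```stata
* Simulated two-level data (all names and values are hypothetical)
clear
set seed 12345
set obs 50                               // 50 schools (level-2 units)
generate LEV2ID = _n
generate psu_id = ceil(LEV2ID/2)         // hypothetical PSUs, two schools each
generate w_school = 1 + 4*runiform()     // hypothetical school-level sampling weight
generate u = rnormal()                   // school random intercept
expand 20                                // 20 students per school
generate w_student = 1 + 2*runiform()    // hypothetical student-within-school weight
generate IV1 = runiform() < 0.5          // binary categorical predictor
generate IV2 = rnormal()                 // continuous predictor
generate CV = rnormal()                  // control variable
generate testscore = 50 + 2*IV1 + 3*IV2 + 1.5*IV1*IV2 + CV + 2*u + rnormal(0, 5)

* Level-1 weight goes in [pweight=]; level-2 weight goes in pweight().
* vce(cluster psu_id) requests robust SEs clustered on the (hypothetical) PSU.
mixed testscore i.IV1 c.IV2 i.IV1#c.IV2 CV [pweight = w_student] ///
    || LEV2ID:, mle pweight(w_school) pwscale(size) vce(cluster psu_id)
```

If the second-level weight is skipped, drop pweight() and pwscale() and keep only the [pweight=] part. If LEV2ID is itself the PSU, vce(cluster psu_id) can be dropped as well.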

Steve Samuels
Statistical Consulting
[email protected]

Stata 14.2
Brian V. Carolan

Join Date: Apr 2014

Posts: 4
#3

17 Jun 2014, 07:17

Thanks, Steve. This is very helpful. I'm aware that mixed can include Levels 1 and 2 weights (yes, I am not concerned with estimating the Level 2 variance so I am skipping the Level 2 weight). My concern was the omission of the stratum specification and its effect on the SEs. Thanks so much for the helpful suggestion. Adding school characteristics as predictors to the Level-2 model makes sense.
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#4

18 Jun 2014, 15:22

You are very welcome, Brian. It's quite possible that analysis with the right set of predictors will be more precise than a stratified analysis without them.

Bill Magee

Join Date: Apr 2014

Posts: 10
#5

10 Jul 2014, 16:08

I have a related question. I am estimating mixed models using longitudinal survey data with only level-2 population weights (people are the level-2 units, and each is surveyed four times). I am using Stata 13.1. Under the survey data section of the multilevel mixed-effects manual (p. 323), the discussion of pwscale() does not explicitly address this issue. I am a bit confused about what w_i|j would be for my data. I have strongly balanced data, so I could see specifying w_i|j = 1, since the probability of interview is 1 given j. However, the discussion of the pwscale() option (size vs. gk) suggests that I might want to adopt a weight of 0.25. Alternatively, if random effects are estimated at each time period, I could imagine setting the level-1 weight equal to the level-2 weight. But again, I am not sure about scaling.

Also, I wonder whether the robust standard errors are still insufficient. I see how robust estimation deals with clustering of observations within persons, but what about the between-persons standard errors (i.e., for fixed effects)? As in OLS, those standard errors are not adjusted for weighting. I am thinking of jackknifing mixed. Should I have posted this as a separate question rather than responding here? Thanks
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#6

11 Jul 2014, 16:57

The theory for the two-level weights applies only if both level 1 and level 2 units were selected by sampling. In your design, the level 1 occasion units were not. Therefore you need only a single sampling weight for the level 1 units, namely the person-weight on that occasion. (In some analyses, the weights on different occasions can differ.) So, the robust survey-based standard errors still apply.

skolenik

Join Date: Mar 2014

Posts: 100
#7

13 Jul 2014, 04:08

My understanding of weighting for mixed models would be that Bill would need to specify level 1 weights as being set to 1, and level 2 (individual) weights equal to the sampling weights he was provided with; and the level 1 weights can be scaled as pwscale(size) if needed.
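
To make the two readings in this thread concrete, here is a rough sketch on a simulated four-wave panel. Every variable name (person_id, wave, y, w_person, w_occasion) and every simulated value is hypothetical. The first mixed call follows the specification just described; the second follows the single-weight reading from post #6. Neither is offered as the definitive answer for Bill's data.

```stata
* Simulated 4-wave panel (all names and values are hypothetical)
clear
set seed 2014
set obs 200                              // 200 persons (level-2 units)
generate person_id = _n
generate w_person = 1 + 3*runiform()     // hypothetical person-level sampling weight
generate u = rnormal()                   // person random intercept
expand 4                                 // four interviews per person
bysort person_id: generate wave = _n
generate w_occasion = 1                  // each occasion is observed with certainty
generate y = 10 + 0.5*wave + u + rnormal()

* Reading 1: occasion weight fixed at 1, sampling weight at the person level
mixed y wave [pweight = w_occasion] || person_id:, pweight(w_person)

* Reading 2 (post #6): a single sampling weight attached to the occasion records
mixed y wave [pweight = w_person] || person_id:
```

When every level-1 weight equals 1, pwscale(size) leaves the weights unchanged, because they already sum to the cluster size.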

Now, between-persons standard errors that control for PSUs are difficult to set up properly with the syntax that mixed provides. Ideally, I would like to see something like vce(cluster psu), but this will not work. Specifying PSUs as the third level, however, will likely remove the possibility of specifying pwscale(). A svy jackknife may be a possibility, but it will likely be difficult to feed the replicate weights to mixed as level-2 weights. I recommended gllamm in another thread as a tool that allows both multilevel weights and a cluster(psu) type of adjustment.

-- Stas Kolenikov || http://stas.kolenikov.name
-- Principal Survey Scientist, Abt SRBI
-- Opinions stated in this post are mine only
Anneka Diedrick

Join Date: Apr 2014

Posts: 1
#8

15 Jul 2014, 05:52

Hi all, thanks for your comments on this subject. I'm wondering whether the techniques you've described also apply to mixed non-linear models, specifically negative binomial and logistic regression? The information under "Survey Data" in Stata's mixed-effects manual seems specific to linear multilevel models. I'm unfamiliar with gllamm. Can it accommodate non-linear multilevel models? Does it run within Stata? Thanks for your input.
Bill Magee

Join Date: Apr 2014

Posts: 10
#9

21 Jul 2014, 16:21

Thanks Stas & Steve. Sorry for the delay in expressing my gratitude; I haven't been online in a while. Unfortunately, for privacy reasons, the group that collected the data does not release PSUs to users outside their group. They suggest jackknifing.

FYI -- the statistician on the project responded as follows (through intermediaries). It would be interesting to know what you think of this. To me it suggests that, for longitudinal data, the value of PSU information in doing adjustments may erode over waves (time):
"There was clustering (PSUs) in the original design, but as we follow people over time and they move, we usually ignore these in the analysis. So we treat this as a random sample with unequal probabilities of selection. Thus the "design degrees of freedom" is just the sample size.

[in Taylor Series approximations Stata does] not usually require strata and cluster variables; in any event you can create such design variables by just having the observation number be the cluster variable, and a constant (say equal to 1) be the strata variable. For the jackknife, deleting one observation at a time to create the pseudo-population is appropriate."
Bill Magee

Join Date: Apr 2014

Posts: 10
#10

22 Jul 2014, 12:03

BTW Anneka, gllamm does run within Stata. You have to download it (type findit gllamm).
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#11

22 Jul 2014, 16:43

I'm not sure what your statistician meant by "ignoring" people who leave the cohort. If one wants an estimate of the original population's experience over all four occasions, then one would need to do compensatory weighting of the "survivors" to make them resemble the original population. Google "attrition weighting" for many resources.
