Hi everyone,
I'm working with a balanced panel dataset of repeated observations of students' time spent reading in a reading app.
About 2000 students are observed each day for about 200 days. On most days (~95%), students spent 0 minutes reading.
I'm interested in estimating gender differences in time spent reading, as well as modeling the effects of other time-invariant and time-varying variables.
I've tried doing this using either xtreg or mixed. What I've found is that mixed gives much smaller standard errors for the coefficients than xtreg, although the point estimates are identical.
For my main results this makes no substantive difference (i.e., the p-values fall on the same side of conventional thresholds). But I also perform some subgroup analyses where the choice between xtreg and mixed makes the difference between a statistically significant and a non-significant finding.
When googling, I only came across examples of very small differences in standard errors between xtreg and mixed.
I've tried to recreate my problem in a mock dataset.
Code:
* Build a balanced panel: 2000 students, 200 days each
clear
set obs 2000
gen id = _n
gen female = cond(mod(_n, 2), 1, 0)
expand = 200
sort id
egen time = seq(), from(1) to(200) block(1)
* Flag every 20th observation (5%) as a day with reading
gen random = cond(mod(_n, 20), "No", "Yes")
gen reading_time = 0
replace reading_time = rnormal(0.5, 0.5) if female==0
replace reading_time = rnormal(1.0, 0.5) if female==1
* Set the remaining 95% of observations to zero minutes
replace reading_time=0 if random=="No"
* Fold negative draws to positive values
replace reading_time = reading_time*-1 if reading_time<0
xtset id time
xtreg reading_time i.female, re
mixed reading_time i.female || time:
Results from xtreg:
Code:
reading_time |      Coef.   Std. Err.       z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    1.female |     .02103    .0006474    32.49   0.000     .0197612    .0222988
       _cons |   .0292542    .0004578    63.91   0.000      .028357    .0301514
Results from mixed:
Code:
reading_time |      Coef.   Std. Err.       z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    1.female |     .02103    .0003444    61.06   0.000      .020355     .021705
       _cons |   .0292542    .0122597     2.39   0.017     .0052257    .0532828
Does anyone know why there are such marked differences? And any suggestions on choosing the most appropriate model?
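One thing I notice in my own mock code is that the two commands do not group the data the same way: after xtset id time, xtreg uses id (the student) as the panel variable, whereas my mixed call puts the random intercept on time (the day). A minimal sketch of a like-for-like comparison follows. It assumes the mock data from the code above are in memory and that the student is the intended grouping level. I would expect the standard errors from these runs to be much closer to each other, but I have not verified that here.

```stata
* Sketch only: assumes the mock dataset from the code above is in memory
* and that the student (id) is the intended grouping level.
xtset id time

* Random-effects GLS with id as the panel variable (same as the original run)
xtreg reading_time i.female, re

* Maximum-likelihood random-intercept model, still grouped by id
xtreg reading_time i.female, mle

* Random intercept for id in mixed; the original call used || time: instead
mixed reading_time i.female || id:
```

If the day-level grouping in the original mixed call was intentional, the two original models answer different questions, and the comparison above would not apply.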
Thanks,
Emil
