
  • Differences in standard errors between mixed and xtreg, re

    Hi everyone,

    I'm working with a balanced panel dataset of repeated observations of students' time spent reading using a reading app.
    About 2000 students are observed each day for about 200 days. On most student-days (~95%), the student spent 0 minutes reading.

    I'm interested in estimating gender differences in time spent reading, as well as modeling the effects of other time-invariant and time-varying variables.

    I've tried doing this using either xtreg or mixed. What I've found is that mixed gives me much smaller standard errors for the coefficients than xtreg does, even though the point estimates are identical.
    For my main results this makes no substantial difference (i.e., the p-values do not vary in terms of conventional thresholds). But I also perform some subgroup analyses where the choice between xtreg and mixed makes the difference between a statistically significant and a non-statistically significant finding.
    When googling I only came across examples of very small differences in standard errors between estimates from xtreg and mixed.


    I've tried to recreate my problem in a mock dataset.
    Code:
    clear
    * 2,000 students, half of them female
    set obs 2000
    gen id = _n
    gen female = cond(mod(_n, 2), 1, 0)
    * 200 daily observations per student
    expand 200
    sort id
    egen time = seq(), from(1) to(200) block(1)
    * flag every 20th row as a day with reading (about 5% of observations)
    gen random = cond(mod(_n, 20), "No", "Yes")
    gen reading_time = 0
    replace reading_time = rnormal(0.5, 0.5) if female==0
    replace reading_time = rnormal(1.0, 0.5) if female==1
    replace reading_time = 0 if random=="No"
    * fold negative draws back to positive values
    replace reading_time = reading_time*-1 if reading_time<0

    xtset id time
    xtreg reading_time i.female, re
    mixed reading_time i.female || time:
    Results from xtreg:
    Code:
    reading_time |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
        1.female |     .02103   .0006474    32.49   0.000     .0197612    .0222988
           _cons |   .0292542   .0004578    63.91   0.000      .028357    .0301514
    Results from mixed:
    Code:
    reading_time |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
        1.female |     .02103   .0003444    61.06   0.000      .020355     .021705
           _cons |   .0292542   .0122597     2.39   0.017     .0052257    .0532828
    Does anyone know why there are such marked differences? And any suggestions on choosing the most appropriate model?

    Thanks,
    Emil

  • #2
    Emil:
    what do you get if you type:
    Code:
    xtset id time
    xtreg reading_time i.female, mle
    mixed reading_time i.female || time:
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Yes, I forgot to mention that I've tried different estimation methods. That does not change the discrepancy in standard errors.


      • #4
        Your specification is incorrect. The random effects are at the id level.

        Code:
        xtset id time
        xtreg reading_time i.female, re
        mixed reading_time i.female || id:
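
        One way to confirm that the two commands now fit the same model is to store both sets of estimates and list them side by side. This is only a minimal sketch using the variable names from the mock dataset above; the stored-estimate names xt_mle and mixed_id are arbitrary labels chosen here. It uses xtreg's mle option so that both fits are maximum likelihood, which is what mixed uses by default.

        ```stata
        * random intercept for each student (id), fitted two ways
        xtset id
        xtreg reading_time i.female, mle
        estimates store xt_mle
        mixed reading_time i.female || id:
        estimates store mixed_id
        * coefficients and standard errors side by side
        estimates table xt_mle mixed_id, se
        ```

        With || time: the random intercept is placed on the day rather than on the student, so repeated observations of the same student are treated as independent given the day. That would be expected to understate the standard error of a student-level covariate such as female.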


        • #5
          Not only what Andrew mentions, but in your case
          Code:
          xtset id time
          does not do anything more than
          Code:
          xtset id
          does.

          That is, time is not in your xtreg regression model at all, anywhere.

          The bigger question, I think, is whether either linear model is suitable for a data-generating process where ca. 95% of your values are zero. Maybe you could look into some kind of hurdle model?
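
          As a rough illustration of that suggestion, a two-part version could model whether a student read at all on a given day separately from how long they read on days with any reading. The sketch below assumes the variable names from the mock dataset; any_reading is a hypothetical indicator created here, and pairing a logit first part with a linear second part is only one of several possible choices.

          ```stata
          * part 1: did the student read at all on a given day?
          gen byte any_reading = reading_time > 0 if !missing(reading_time)
          melogit any_reading i.female || id:

          * part 2: time spent reading, using only the days with some reading
          mixed reading_time i.female if any_reading == 1 || id:
          ```

          Stata also has a built-in churdle command for hurdle models; whether it suits repeated observations on the same students would need to be checked separately.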


          • #6
            Originally posted by Andrew Musau
            Your specification is incorrect. The random effects are at the id level.

            Code:
            xtset id time
            xtreg reading_time i.female, re
            mixed reading_time i.female || id:
            Yes, of course - that fixes it.

            Thank you.

            And thanks, Joseph - I will look into the hurdle model.
