  • How to deal with heteroskedasticity in linear mixed models?

    Hello,

    I have a linear mixed model to test the within-subject difference in activity behaviours (e.g., sleep, sedentary time) between two timepoints (school vs. holidays). The trouble is that the activity behaviours are heteroskedastic across the timepoint variable.

    I'm wondering how I should deal with this? Can I use residuals(, by(timepoint)) to model the heteroskedasticity? Or should I use vce(robust)? Or something else?

    I've tried both, but I get the same coefficient for the activity behaviour regardless of which syntax I run, and it is also the same as in the model without any adjustment for heteroskedasticity.

    What am I doing wrong?

    Here's what I mean:


    . mixed MARCAphysicalactivity i.time_new || wave: || schoolid: || id:, residuals(,by(time_new))

    Obtaining starting values by EM ...

    Performing gradient-based optimization:
    Iteration 0: log likelihood = -1546.7942
    Iteration 1: log likelihood = -1531.5529
    Iteration 2: log likelihood = -1530.5315
    Iteration 3: log likelihood = -1530.4198
    Iteration 4: log likelihood = -1530.4172
    Iteration 5: log likelihood = -1530.4172

    Computing standard errors ...

    Mixed-effects ML regression Number of obs = 266

    Grouping information
    -------------------------------------------------------------
    | No. of Observations per group
    Group variable | groups Minimum Average Maximum
    ----------------+--------------------------------------------
    wave | 2 124 133.0 142
    schoolid | 23 2 11.6 62
    id | 133 2 2.0 2
    -------------------------------------------------------------

    Wald chi2(1) = 3.45
    Log likelihood = -1530.4172 Prob > chi2 = 0.0632

    ---------------------------------------------------------------------------------------
    MARCAphysicalactivity | Coefficient Std. err. z P>|z| [95% conf. interval]
    ----------------------+----------------------------------------------------------------
    time_new |
    holiday | -16.9201 9.106392 -1.86 0.063 -34.7683 .9281045
    _cons | 148.2458 5.242488 28.28 0.000 137.9707 158.5209
    ---------------------------------------------------------------------------------------

    ------------------------------------------------------------------------------
    Random-effects parameters | Estimate Std. err. [95% conf. interval]
    -----------------------------+------------------------------------------------
    wave: Identity |
    var(_cons) | 3.69e-12 1.11e-10 8.84e-38 1.54e+14
    -----------------------------+------------------------------------------------
    schoolid: Identity |
    var(_cons) | 6.069217 128.3406 6.07e-18 6.06e+18
    -----------------------------+------------------------------------------------
    id: Identity |
    var(_cons) | 1196.898 566.381 473.4367 3025.882
    -----------------------------+------------------------------------------------
    Residual: Independent, |
    by time_new |
    school: var(e) | 2382.838 602.2506 1451.969 3910.493
    holiday: var(e) | 8646.371 1183.862 6611.306 11307.86
    ------------------------------------------------------------------------------
    LR test vs. linear model: chi2(4) = 38.32 Prob > chi2 = 0.0000

    Note: LR test is conservative and provided only for reference.



    . mixed MARCAphysicalactivity i.time_new || wave: || schoolid: || id:, vce(robust)

    Performing EM optimization ...

    Performing gradient-based optimization:
    Iteration 0: log pseudolikelihood = -1546.7942
    Iteration 1: log pseudolikelihood = -1546.4791
    Iteration 2: log pseudolikelihood = -1546.4684
    Iteration 3: log pseudolikelihood = -1546.4684

    Computing standard errors ...

    Mixed-effects regression Number of obs = 266

    Grouping information
    -------------------------------------------------------------
    | No. of Observations per group
    Group variable | groups Minimum Average Maximum
    ----------------+--------------------------------------------
    wave | 2 124 133.0 142
    schoolid | 23 2 11.6 62
    id | 133 2 2.0 2
    -------------------------------------------------------------

    Wald chi2(1) = 20.08
    Log pseudolikelihood = -1546.4684 Prob > chi2 = 0.0000

    (Std. err. adjusted for 2 clusters in wave)
    ---------------------------------------------------------------------------------------
    | Robust
    MARCAphysicalactivity | Coefficient std. err. z P>|z| [95% conf. interval]
    ----------------------+----------------------------------------------------------------
    time_new |
    holiday | -16.9201 3.776363 -4.48 0.000 -24.32163 -9.51856
    _cons | 145.6965 .4892915 297.77 0.000 144.7375 146.6555
    ---------------------------------------------------------------------------------------

    ------------------------------------------------------------------------------
    | Robust
    Random-effects parameters | Estimate std. err. [95% conf. interval]
    -----------------------------+------------------------------------------------
    wave: Identity |
    var(_cons) | 1.13e-11 7.10e-09 0 .
    -----------------------------+------------------------------------------------
    schoolid: Identity |
    var(_cons) | 241.3284 215.7017 41.8597 1391.3
    -----------------------------+------------------------------------------------
    id: Identity |
    var(_cons) | 962.7075 272.0551 553.2865 1675.092
    -----------------------------+------------------------------------------------
    var(Residual) | 5514.618 1806.245 2902.116 10478.91
    ------------------------------------------------------------------------------



    . mixed MARCAphysicalactivity i.time_new || wave: || schoolid: || id:

    Performing EM optimization ...

    Performing gradient-based optimization:
    Iteration 0: log likelihood = -1546.7942
    Iteration 1: log likelihood = -1546.4791
    Iteration 2: log likelihood = -1546.4684
    Iteration 3: log likelihood = -1546.4684

    Computing standard errors ...

    Mixed-effects ML regression Number of obs = 266

    Grouping information
    -------------------------------------------------------------
    | No. of Observations per group
    Group variable | groups Minimum Average Maximum
    ----------------+--------------------------------------------
    wave | 2 124 133.0 142
    schoolid | 23 2 11.6 62
    id | 133 2 2.0 2
    -------------------------------------------------------------

    Wald chi2(1) = 3.45
    Log likelihood = -1546.4684 Prob > chi2 = 0.0632

    ---------------------------------------------------------------------------------------
    MARCAphysicalactivity | Coefficient Std. err. z P>|z| [95% conf. interval]
    ----------------------+----------------------------------------------------------------
    time_new |
    holiday | -16.9201 9.106403 -1.86 0.063 -34.76832 .9281262
    _cons | 145.6965 8.127796 17.93 0.000 129.7663 161.6267
    ---------------------------------------------------------------------------------------

    ------------------------------------------------------------------------------
    Random-effects parameters | Estimate Std. err. [95% conf. interval]
    -----------------------------+------------------------------------------------
    wave: Identity |
    var(_cons) | 1.13e-11 4.08e-10 1.94e-42 6.56e+19
    -----------------------------+------------------------------------------------
    schoolid: Identity |
    var(_cons) | 241.3284 248.5558 32.05642 1816.778
    -----------------------------+------------------------------------------------
    id: Identity |
    var(_cons) | 962.7075 589.5634 289.8794 3197.212
    -----------------------------+------------------------------------------------
    var(Residual) | 5514.618 676.2449 4336.452 7012.878
    ------------------------------------------------------------------------------
    LR test vs. linear model: chi2(3) = 6.22 Prob > chi2 = 0.1014

    Note: LR test is conservative and provided only for reference.

    Thanks for your help.

  • #2
    Amanda:
    No matter which type of regression you run, options that correct for the usual nuisances affect the standard errors only; the point estimates of the coefficients remain the same.
    That said, I would probably go with cluster-robust standard errors, which take heteroskedasticity and/or autocorrelation into account.
    As you can see from the following toy example, -robust- and -vce(cluster clusterid)- do the very same job (this feature is shared with -xtreg-, as -mixed- is the other side of the -xtreg, re mle- coin):
    Code:
    . use https://www.stata-press.com/data/r17/pig.dta
    (Longitudinal analysis of pig weights)
    
    . mixed weight i.week, robust || id:
    
    Performing EM optimization ...
    
    Performing gradient-based optimization:
    Iteration 0:   log pseudolikelihood = -1007.0675  
    Iteration 1:   log pseudolikelihood = -1007.0675  
    
    Computing standard errors ...
    
    Mixed-effects regression                        Number of obs     =        432
    Group variable: id                              Number of groups  =         48
                                                    Obs per group:
                                                                  min =          9
                                                                  avg =        9.0
                                                                  max =          9
                                                    Wald chi2(8)      =    6158.15
    Log pseudolikelihood = -1007.0675               Prob > chi2       =     0.0000
    
                                        (Std. err. adjusted for 48 clusters in id)
    ------------------------------------------------------------------------------
                 |               Robust
          weight | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            week |
              2  |   6.760417    .162394    41.63   0.000      6.44213    7.078703
              3  |   13.84375   .3105547    44.58   0.000     13.23507    14.45243
              4  |     19.375   .3343844    57.94   0.000     18.71962    20.03038
              5  |   25.13542   .4536934    55.40   0.000     24.24619    26.02464
              6  |   31.42708   .4655674    67.50   0.000     30.51459    32.33958
              7  |    37.4375    .554164    67.56   0.000     36.35136    38.52364
              8  |   44.28125   .6252493    70.82   0.000     43.05578    45.50672
              9  |   50.19792   .7742446    64.83   0.000     48.68043    51.71541
                 |
           _cons |   25.02083   .3563502    70.21   0.000      24.3224    25.71927
    ------------------------------------------------------------------------------
    
    ------------------------------------------------------------------------------
                                 |               Robust          
      Random-effects parameters  |   Estimate   std. err.     [95% conf. interval]
    -----------------------------+------------------------------------------------
    id: Identity                 |
                      var(_cons) |   14.83704   2.756406       10.3089    21.35416
    -----------------------------+------------------------------------------------
                   var(Residual) |   4.207462   .6518382       3.10562    5.700225
    ------------------------------------------------------------------------------
    
    . mixed weight i.week, vce(cluster id) || id:
    
    Performing EM optimization ...
    
    Performing gradient-based optimization:
    Iteration 0:   log pseudolikelihood = -1007.0675  
    Iteration 1:   log pseudolikelihood = -1007.0675  
    
    Computing standard errors ...
    
    Mixed-effects regression                        Number of obs     =        432
    Group variable: id                              Number of groups  =         48
                                                    Obs per group:
                                                                  min =          9
                                                                  avg =        9.0
                                                                  max =          9
                                                    Wald chi2(8)      =    6158.15
    Log pseudolikelihood = -1007.0675               Prob > chi2       =     0.0000
    
                                        (Std. err. adjusted for 48 clusters in id)
    ------------------------------------------------------------------------------
                 |               Robust
          weight | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            week |
              2  |   6.760417    .162394    41.63   0.000      6.44213    7.078703
              3  |   13.84375   .3105547    44.58   0.000     13.23507    14.45243
              4  |     19.375   .3343844    57.94   0.000     18.71962    20.03038
              5  |   25.13542   .4536934    55.40   0.000     24.24619    26.02464
              6  |   31.42708   .4655674    67.50   0.000     30.51459    32.33958
              7  |    37.4375    .554164    67.56   0.000     36.35136    38.52364
              8  |   44.28125   .6252493    70.82   0.000     43.05578    45.50672
              9  |   50.19792   .7742446    64.83   0.000     48.68043    51.71541
                 |
           _cons |   25.02083   .3563502    70.21   0.000      24.3224    25.71927
    ------------------------------------------------------------------------------
    
    ------------------------------------------------------------------------------
                                 |               Robust          
      Random-effects parameters  |   Estimate   std. err.     [95% conf. interval]
    -----------------------------+------------------------------------------------
    id: Identity                 |
                      var(_cons) |   14.83704   2.756406       10.3089    21.35416
    -----------------------------+------------------------------------------------
                   var(Residual) |   4.207462   .6518382       3.10562    5.700225
    ------------------------------------------------------------------------------
    
    . xtset id week
    
    Panel variable: id (strongly balanced)
     Time variable: week, 1 to 9
             Delta: 1 unit
    
    . xtreg weight i.week, mle vce(cluster id)
    
    Fitting constant-only model:
    Iteration 0:   log likelihood = -1827.2124
    Iteration 1:   log likelihood = -1827.2118
    
    Fitting full model:
    Iteration 0:   log likelihood = -1008.0493
    Iteration 1:   log likelihood = -1007.0894
    Iteration 2:   log likelihood = -1007.0675
    Iteration 3:   log likelihood = -1007.0675
    
    Random-effects ML regression                        Number of obs    =     432
    Group variable: id                                  Number of groups =      48
    
    Random effects u_i ~ Gaussian                       Obs per group:
                                                                     min =       9
                                                                     avg =     9.0
                                                                     max =       9
    
                                                        Wald chi2(8)     = 6158.16
    Log likelihood = -1007.0675                         Prob > chi2      =  0.0000
    
                                        (Std. err. adjusted for 48 clusters in id)
    ------------------------------------------------------------------------------
                 |               Robust
          weight | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
            week |
              2  |   6.760417    .162394    41.63   0.000      6.44213    7.078703
              3  |   13.84375   .3105546    44.58   0.000     13.23507    14.45243
              4  |     19.375   .3343844    57.94   0.000     18.71962    20.03038
              5  |   25.13542   .4536933    55.40   0.000     24.24619    26.02464
              6  |   31.42708   .4655674    67.50   0.000     30.51459    32.33958
              7  |    37.4375   .5541639    67.56   0.000     36.35136    38.52364
              8  |   44.28125   .6252492    70.82   0.000     43.05578    45.50672
              9  |   50.19792   .7742445    64.83   0.000     48.68043    51.71541
                 |
           _cons |   25.02083   .3563502    70.21   0.000      24.3224    25.71927
    -------------+----------------------------------------------------------------
        /sigma_u |   3.851886   .3577988                      3.210746    4.621052
        /sigma_e |    2.05121   .1588911                      1.762277    2.387514
             rho |   .7790719   .0341462                      .7066346    .8400213
    ------------------------------------------------------------------------------
    
    .
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Originally posted by amanda watson
      I've tried both but I get the same coefficient for the activity behaviour regardless of which of those syntax I run, and this doesn't change from the model without any adjustment for heteroskedasticity.

      What am I doing wrong?
      It's because you're conducting a paired t-test. See below. (Begin at the "Begin here" comment.)

      Residuals don't enter into the calculation of the within-subject / before-versus-after / paired-data / fixed-effect estimator and so heteroskedasticity in them between the two time points doesn't matter.

      .
      . version 17.0

      .
      . clear *

      .
      . // seedem
      . set seed 805128175

      .
      . program define lrEm // Dear StataCorp:  can we please have a -noverbose- option to -lrtest-?
        1.         version 17.0
        2.         syntax anything
        3.
      .         quietly lrtest `anything'
        4.         display in smcl as text " LR chi2(" as result r(df) as text ") = " ///
      >                 as result %6.2f r(chi2)
        5.         display in smcl as text "Prob > chi2 = " as result %6.4f r(p)
        6. end

      .
      . // Schools
      . quietly set obs 25

      . generate int sid = _n

      . generate double sid_u = rnormal()

      .
      . // Pupils
      . quietly expand 6

      . generate int pid = _n

      . generate double pid_u = rnormal()

      .
      . // Time points
      . drawnorm out0 out1, double mean(100 100) sd(2 3)

      . foreach var of varlist out? {
        2.         quietly replace `var' = `var' + sid_u + pid_u
        3. }

      .
      . preserve

      . quietly reshape long out, i(pid) j(tim)

      .
      . *
      . * Begin here
      . *
      . // #1 Heteroskedastic residuals: delta = 0.2151003, t(149) = 0.74, P = 0.463
      . mixed out i.tim || sid: || pid: , residuals(independent, by(tim)) ///
      >         reml dfmethod(satterthwaite) nogroup nolrtest nolog

      Mixed-effects REML regression                   Number of obs     =        300
      DF method: Satterthwaite                        DF:           min =      29.99
                                                                    avg =     340.71
                                                                    max =     651.44
                                                      F(1,   651.44)    =       0.54
      Log restricted-likelihood = -716.62066          Prob > F          =     0.4625

      ------------------------------------------------------------------------------
               out | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
             1.tim |   .2151003   .2925913     0.74   0.463    -.3594355    .7896361
             _cons |   99.58771   .2838066   350.90   0.000     99.00809    100.1673
      ------------------------------------------------------------------------------

      ------------------------------------------------------------------------------
        Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
      -----------------------------+------------------------------------------------
      sid: Identity                |
                        var(_cons) |   1.275721   .5201392      .5737233    2.836672
      -----------------------------+------------------------------------------------
      pid: Identity                |
                        var(_cons) |   .2534546   .5432447      .0037972    16.91736
      -----------------------------+------------------------------------------------
      Residual: Independent,       |
          by tim                   |
                         0: var(e) |   4.174142   .7301287      2.962631    5.881078
                         1: var(e) |   8.667306   1.143494      6.692425    11.22496
      ------------------------------------------------------------------------------

      . estimates store Het

      .
      . // #2 Homoskedastic residuals: delta = 0.2151003, t(149) = 0.74, P = 0.463
      . mixed out i.tim || sid: || pid: , ///
      >         reml dfmethod(satterthwaite) nogroup nolrtest nolog

      Mixed-effects REML regression                   Number of obs     =        300
      DF method: Satterthwaite                        DF:           min =      39.02
                                                                    avg =      94.01
                                                                    max =     149.00
                                                      F(1,   149.00)    =       0.54
      Log restricted-likelihood = -724.59316          Prob > F          =     0.4634

      ------------------------------------------------------------------------------
               out | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
             1.tim |   .2151003   .2925916     0.74   0.463    -.3630646    .7932652
             _cons |   99.58771   .3114014   319.80   0.000     98.95785    100.2176
      ------------------------------------------------------------------------------

      ------------------------------------------------------------------------------
        Random-effects parameters  |   Estimate   Std. err.     [95% conf. interval]
      -----------------------------+------------------------------------------------
      sid: Identity                |
                        var(_cons) |   1.313926   .5502013      .5782738    2.985439
      -----------------------------+------------------------------------------------
      pid: Identity                |
                        var(_cons) |   .2413292   .5735622      .0022886     25.4474
      -----------------------------+------------------------------------------------
                     var(Residual) |   6.420739   .7438912      5.116428    8.057552
      ------------------------------------------------------------------------------

      . lrEm Het
       LR chi2(1) =  15.95
      Prob > chi2 = 0.0001

      .
      . // #3 Mixed-model ANOVA ignoring schools altogether:  delta = 0.2151003, t(149) = 0.74, P = 0.463
      . xtreg out i.tim, i(pid) fe

      Fixed-effects (within) regression               Number of obs     =        300
      Group variable: pid                             Number of groups  =        150

      R-squared:                                      Obs per group:
           Within  = 0.0036                                         min =          2
           Between =      .                                         avg =        2.0
           Overall = 0.0015                                         max =          2

                                                      F(1,149)          =       0.54
      corr(u_i, Xb) = 0.0000                          Prob > F          =     0.4634

      ------------------------------------------------------------------------------
               out | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
      -------------+----------------------------------------------------------------
             1.tim |   .2151003   .2925912     0.74   0.463    -.3630638    .7932644
             _cons |   99.58771   .2068932   481.35   0.000     99.17889    99.99654
      -------------+----------------------------------------------------------------
           sigma_u |  2.1729107
           sigma_e |  2.5339143
               rho |  .42375064   (fraction of variance due to u_i)
      ------------------------------------------------------------------------------
      F test that all u_i=0: F(149, 149) = 1.47                    Prob > F = 0.0096

      .
      . restore

      . // #4 Paired Student's t-test:  delta = 0.2151003, t(149) = 0.74, P = 0.463
      . ttest out1 = out0

      Paired t test
      ------------------------------------------------------------------------------
      Variable |     Obs        Mean    Std. err.   Std. dev.   [95% conf. interval]
      ---------+--------------------------------------------------------------------
          out1 |     150    99.80281    .2613988    3.201468    99.28628    100.3193
          out0 |     150    99.58771    .1934667    2.369474    99.20542       99.97
      ---------+--------------------------------------------------------------------
          diff |     150    .2151003    .2925912    3.583496   -.3630638    .7932644
      ------------------------------------------------------------------------------
           mean(diff) = mean(out1 - out0)                                t =   0.7352
       H0: mean(diff) = 0                              Degrees of freedom =      149

       Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
       Pr(T < t) = 0.7683         Pr(|T| > |t|) = 0.4634          Pr(T > t) = 0.2317

      .
      . exit

      end of do-file


      .


      In this case, I have to disagree with Carlo in that I do not recommend using robust or clustered standard errors. When you do, the standard errors are adjusted for clustering on your highest level of the hierarchy, wave, which has only two categories and whose variance has collapsed to zero, anyway. You can see the absurdity of using clustered / robust standard errors in this case in Stata's notification in the regression output that you show: "(Std. err. adjusted for 2 clusters in wave)".
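
The point made earlier in this post, that the residual variances do not change the paired contrast, can be sketched algebraically. The notation below is an assumption introduced for illustration and does not appear in the thread: pupil i in school s(i) is observed at both time points j = 0, 1, with a school effect v, a pupil effect u, and independent residuals e whose variance may differ by time point.

```latex
% Assumed notation (not from the thread): pupil i = 1..n in school s(i), time j in {0, 1}.
\begin{align}
y_{ij} &= \beta_0 + \beta_1 j + v_{s(i)} + u_i + e_{ij},
  & \operatorname{Var}(e_{ij}) &= \sigma_j^2 \\
d_i &= y_{i1} - y_{i0} = \beta_1 + e_{i1} - e_{i0} \\
\hat{\beta}_1 &= \bar{d} = \frac{1}{n} \sum_{i=1}^{n} d_i,
  & \operatorname{Var}(\hat{\beta}_1) &= \frac{\sigma_0^2 + \sigma_1^2}{n}
\end{align}
```

The school and pupil effects cancel in the difference, so with complete pairs the point estimate is just the mean difference whatever the residual variances are. Only the sum of the two residual variances enters the standard error, and a homoskedastic model estimates their average. The simulated output above appears consistent with this: (4.174142 + 8.667306)/150 and 2(6.420739)/150 both give a standard error of about 0.2926, which matches the values reported for 1.tim.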



      • #4
        Joseph:
        my bad.
        Thanks for correcting me.
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Thank you for taking the time to reply, I really appreciate your help.

          Just to be clear, you're saying I don't need to do anything about the heteroskedasticity? Is that right?



          • #6
            Originally posted by Carlo Lazzaro
            my bad.
            No, the output in the format in which it is presented is pretty hard to pore over. The flag that something is amiss is the dramatically different standard error and test statistic compared with the other two models, which is unexpected given the nature of the contrast.

            Also, I want to emphasize that I find Carlo's advice generally unassailable; it was "In this case, . . ."



            • #7
              Originally posted by amanda watson
              Just to be clear, you're saying I don't need to do anything about the heteroskedasticity? Is that right?
              Well, if the research hypothesis that you're interested in can be evaluated with a Student's t-test, then why not just do a Student's t-test? You don't have to worry about convergence failures, collapsed variance components, audience comprehension etc.

              On the other hand, if you're also interested in describing the data (estimates of variances at the various levels in the hierarchy, for example), or if you intend eventually to include additional predictors (and interaction terms) in the regression model, then it will pay to get the residual variance structure right, up front.
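
One way to check the residual variance structure before adding predictors might look like the sketch below. It runs on simulated data, not the poster's; the variable names (pid, tim, out) mirror the simulation in #3, and the seed, sample size, and variances are illustrative assumptions. It fits the model with one residual variance and with one residual variance per time point, then compares the two by likelihood-ratio test.

```stata
* Sketch on simulated data; every name and value here is an illustrative assumption.
version 17.0
clear
set seed 12345
set obs 150
generate int pid = _n
generate double pid_u = rnormal(0, 2)                // pupil-level random intercept
generate double out0 = 100 + pid_u + rnormal(0, 2)   // time 0: residual SD 2
generate double out1 = 100 + pid_u + rnormal(0, 3)   // time 1: residual SD 3
reshape long out, i(pid) j(tim)

* Homoskedastic residuals: a single residual variance
mixed out i.tim || pid:, nolog
estimates store Hom

* Heteroskedastic residuals: one residual variance per time point
mixed out i.tim || pid:, residuals(independent, by(tim)) nolog
estimates store Het

* Likelihood-ratio test of the one extra residual-variance parameter
lrtest Hom Het
```

Both models are fitted by maximum likelihood (the default for -mixed-) with identical fixed effects, so the test concerns only the residual structure.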



              • #8
                Originally posted by Joseph Coveney
                Well, if the research hypothesis that you're interested in can be evaluated with a Student's t-test, then why not just do a Student's t-test? You don't have to worry about convergence failures, collapsed variance components, audience comprehension etc.

                On the other hand, if you're also interested in describing the data (estimates of variances at the various levels in the hierarchy, for example), or if you intend eventually to include additional predictors (and interaction terms) in the regression model, then it will pay to get the residual variance structure right, up front.
                We will be adding interaction terms so I don't think t-tests will suffice.

                I see what you mean about the different standard errors when using the robust option.

                Does this mean I should use the residuals option, i.e.:


                . mixed MARCAphysicalactivity i.time_new || wave: || schoolid: || id:, residuals(,by(time_new))

                Obtaining starting values by EM ...

                Performing gradient-based optimization:
                Iteration 0: log likelihood = -1546.7942
                Iteration 1: log likelihood = -1531.5529
                Iteration 2: log likelihood = -1530.5315
                Iteration 3: log likelihood = -1530.4198
                Iteration 4: log likelihood = -1530.4172
                Iteration 5: log likelihood = -1530.4172

                Computing standard errors ...

                Mixed-effects ML regression Number of obs = 266

                Grouping information
                -------------------------------------------------------------
                | No. of Observations per group
                Group variable | groups Minimum Average Maximum
                ----------------+--------------------------------------------
                wave | 2 124 133.0 142
                schoolid | 23 2 11.6 62
                id | 133 2 2.0 2
                -------------------------------------------------------------

                Wald chi2(1) = 3.45
                Log likelihood = -1530.4172 Prob > chi2 = 0.0632

                ---------------------------------------------------------------------------------------
                MARCAphysicalactivity | Coefficient Std. err. z P>|z| [95% conf. interval]
                ----------------------+----------------------------------------------------------------
                time_new |
                holiday | -16.9201 9.106392 -1.86 0.063 -34.7683 .9281045
                _cons | 148.2458 5.242488 28.28 0.000 137.9707 158.5209
                ---------------------------------------------------------------------------------------

                ------------------------------------------------------------------------------
                Random-effects parameters | Estimate Std. err. [95% conf. interval]
                -----------------------------+------------------------------------------------
                wave: Identity |
                var(_cons) | 3.69e-12 1.11e-10 8.84e-38 1.54e+14
                -----------------------------+------------------------------------------------
                schoolid: Identity |
                var(_cons) | 6.069217 128.3406 6.07e-18 6.06e+18
                -----------------------------+------------------------------------------------
                id: Identity |
                var(_cons) | 1196.898 566.381 473.4367 3025.882
                -----------------------------+------------------------------------------------
                Residual: Independent, |
                by time_new |
                school: var(e) | 2382.838 602.2506 1451.969 3910.493
                holiday: var(e) | 8646.371 1183.862 6611.306 11307.86
                ------------------------------------------------------------------------------
                LR test vs. linear model: chi2(4) = 38.32 Prob > chi2 = 0.0000

                Note: LR test is conservative and provided only for reference.



