  • Modeling heterogeneous group variances

    I would appreciate it if Statalist could help with my understanding of the mixed-effects syntax and the implied model.

    Suppose that I have a randomly sampled group of people (-person_id-), each measured on two different characteristics. The data are in long form, with -measure- identifying the type of measurement and -x- holding its value. I would like to compute the difference between the means of the two measures while modeling a separate (heterogeneous) variance for each type of measure. Because both measurements come from the same people, a t-test for independent groups is clearly inappropriate.

    To help, I've provided a small, reproducible dataset with models that I think do what I'm asking.

    Code:
    clear
    set seed 423
    
    set obs 100
    mat def M = (4, 7)
    mat def SD = (1.5, 0.75)
    mat def R = (1, 0.2 \ 0.2, 1)
    drawnorm x1 x2, mean(M) sd(SD) corr(R)
    gen int person_id = _n
    order person_id, first
    reshape long x , i(person_id) j(measure)
    compress
    In the first model, I estimate a separate mean for each measurement type. A separate residual variance is estimated for each measurement type, and the residuals are treated as independent, so the two measurements on the same person are not correlated. The second level is included so that each person forms their own cluster, but I'm not estimating a person-specific intercept, since I think this is handled by the residual variance structure in this case.
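
    Written out, the model I believe this first specification implies is sketched below. The notation is my own assumption and does not come from the Stata output: i indexes persons (1, ..., n), j indexes the measurement type (1 or 2), \beta_j is the mean of measure j, and \sigma_j^2 is its residual variance.

    ```latex
    x_{ij} = \beta_j + e_{ij}, \qquad
    e_{ij} \sim N(0, \sigma_j^2), \qquad
    \operatorname{Cov}(e_{i1}, e_{i2}) = 0
    % With n persons and one observation per person per measure:
    \operatorname{Var}(\hat\beta_j) = \frac{\sigma_j^2}{n}
    ```

    If this reading is right, each standard error should be roughly sqrt(var(e)/n). With n = 100, the reported variance of 1.778458 gives sqrt(0.01778458), or about 0.1334, which is in line with the standard error shown for measure 1.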

    Code:
    . mixed x ibn.measure, nocons || (person_id : , nocons), resid(ind, by(measure)) reml dfmethod(kroger) 
    * output omitted
    
    Mixed-effects REML regression                   Number of obs     =        200
    Group variable: person_id                       Number of groups  =        100
    
                                                    Obs per group:
                                                                  min =          2
                                                                  avg =        2.0
                                                                  max =          2
    DF method: Kenward-Roger                        DF:           min =      99.00
                                                                  avg =      49.50
                                                                  max =      99.00
    
                                                    F(2,   263.30)    =    3910.89
    Log restricted-likelihood = -297.96965          Prob > F          =     0.0000
    
    ------------------------------------------------------------------------------
               x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         measure |
              1  |   3.912262   .1333588    29.34   0.000     3.647649    4.176875
              2  |   7.102253   .0850039    83.55   0.000     6.935648    7.268857
    ------------------------------------------------------------------------------
    
    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    person_id:           (empty) |
    -----------------------------+------------------------------------------------
    Residual: Independent,       |
        by measure               |
                       1: var(e) |   1.778458   .2527791      1.346043    2.349785
                       2: var(e) |   .7225666   .1027011      .5468817    .9546901
    ------------------------------------------------------------------------------
    LR test vs. linear model: chi2(1) = 19.43                 Prob > chi2 = 0.0000
    
    Note: The reported degrees of freedom assumes the null hypothesis is not on the boundary of the parameter space.  If this is not true, then the reported
          test is conservative.
    The second model is similar to the one above, except that the two measurements within a subject are allowed to be correlated.
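
    As a sketch of what this second specification implies, using my own notation rather than anything printed by Stata (\sigma_{12} is the within-person covariance between the two residuals, and n is the number of persons):

    ```latex
    \begin{pmatrix} e_{i1} \\ e_{i2} \end{pmatrix}
    \sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},\;
    \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix} \right)
    % Variance of the estimated difference in means:
    \operatorname{Var}(\hat\beta_2 - \hat\beta_1)
      = \frac{\sigma_1^2 + \sigma_2^2 - 2\sigma_{12}}{n}
    ```

    Setting \sigma_{12} = 0 recovers the independent-residual case, where the variance of the difference is (\sigma_1^2 + \sigma_2^2)/n. A positive covariance therefore shrinks the standard error of the difference while leaving the standard errors of the individual means unchanged.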

    Code:
    . mixed x ibn.measure, nocons || (person_id : , nocons), resid(un, t(measure)) reml dfmethod(kroger) 
    * output omitted
    
    Mixed-effects REML regression                   Number of obs     =        200
    Group variable: person_id                       Number of groups  =        100
    
                                                    Obs per group:
                                                                  min =          2
                                                                  avg =        2.0
                                                                  max =          2
    DF method: Kenward-Roger                        DF:           min =      99.00
                                                                  avg =      99.00
                                                                  max =      99.00
    
                                                    F(2,    98.00)    =    3463.37
    Log restricted-likelihood = -293.14194          Prob > F          =     0.0000
    
    ------------------------------------------------------------------------------
               x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         measure |
              1  |   3.912262   .1333588    29.34   0.000     3.647649    4.176875
              2  |   7.102253   .0850039    83.55   0.000     6.933587    7.270919
    ------------------------------------------------------------------------------
    
    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    person_id:           (empty) |
    -----------------------------+------------------------------------------------
    Residual: Unstructured       |
                         var(e1) |   1.778458   .2527791      1.346043    2.349785
                         var(e2) |   .7225667   .1027011      .5468817    .9546901
                      cov(e1,e2) |   .3455619   .1191073       .112116    .5790078
    ------------------------------------------------------------------------------
    LR test vs. linear model: chi2(2) = 29.09                 Prob > chi2 = 0.0000
    
    Note: The reported degrees of freedom assumes the null hypothesis is not on the boundary of the parameter space.  If this is not true, then the reported
          test is conservative.
    Based on these, I seem to be able to recover the same group means and variance-covariance structure. So am I on the right path?
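
    For the difference between the two means itself, one possible approach is a linear combination of the fitted coefficients. This is a minimal sketch that assumes one of the -mixed- models above has just been fitted and that the long-format example data are still in memory. The paired t test on the wide layout is only a cross-check I am suggesting, not part of the models above.

    ```stata
    * Run right after either -mixed- command, while its results are active.
    * With ibn.measure and nocons, the coefficients are named 1.measure and 2.measure.
    lincom 2.measure - 1.measure

    * Cross-check: paired t test on the same data in wide layout.
    * preserve/restore leaves the long-format data untouched afterwards.
    preserve
    reshape wide x, i(person_id) j(measure)
    ttest x2 == x1
    restore
    ```

    The point estimate of the difference should be the same from both models, since the estimated means are identical. The standard error is where the two residual structures should differ, because only the unstructured model accounts for the within-person covariance.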