  • Interpreting random effects: can statistical significance of variance components be determined?

    I use Stata 14.1 on Windows 10

    Using this code:
    Code:
    mixed tolind c.age##c.age Inspired Fables female Liberal Moderate white ceduc marry kids south urban rural suburban crelcom mainline blackprot catholic nofaith tolindsd [fweight=wtssall]|| _all:R.cohort5 || year:
    I get these results:
    HTML Code:
    -------------------------------------------------------------
                    |     No. of       Observations per Group
     Group Variable |     Groups    Minimum    Average    Maximum
    ----------------+--------------------------------------------
               _all |          1     14,604   14,604.0     14,604
               year |         18        348      810.8      1,733
    -------------------------------------------------------------
    
                                                    Wald chi2(21)     =    8042.40
    Log likelihood =  -39253.37                     Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
          tolind |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |  -.0024984   .0133355    -0.19   0.851    -.0286355    .0236387
                 |
     c.age#c.age |  -.0003943   .0001274    -3.10   0.002    -.0006439   -.0001446
                 |
        Inspired |   1.680803    .072354    23.23   0.000     1.538992    1.822614
          Fables |   1.882974   .1039909    18.11   0.000     1.679155    2.086792
          female |   .1422362   .0607907     2.34   0.019     .0230886    .2613838
         Liberal |   .2627643   .0785938     3.34   0.001     .1087233    .4168053
        Moderate |   .1032633   .0708411     1.46   0.145    -.0355828    .2421094
           white |   .6327955   .0920592     6.87   0.000     .4523628    .8132283
           ceduc |   .1892372   .0084649    22.36   0.000     .1726463    .2058281
           marry |  -.0779974   .0683676    -1.14   0.254    -.2119955    .0560006
            kids |  -.2239905   .0805381    -2.78   0.005    -.3818422   -.0661387
           south |  -.5834785   .0644605    -9.05   0.000    -.7098187   -.4571383
           urban |   .0869833   .0832515     1.04   0.296    -.0761867    .2501532
           rural |  -.7469693   .1015166    -7.36   0.000    -.9459382   -.5480003
        suburban |   .3433786   .0730651     4.70   0.000     .2001736    .4865836
         crelcom |  -.0815977   .0096197    -8.48   0.000     -.100452   -.0627435
        mainline |   .4905731   .0900192     5.45   0.000     .3141387    .6670074
       blackprot |   .0190671   .1368865     0.14   0.889    -.2492256    .2873598
        catholic |   .2306323   .0786859     2.93   0.003     .0764108    .3848538
         nofaith |   .4159865   .1095781     3.80   0.000     .2012174    .6307556
        tolindsd |  -8.500621   .1499065   -56.71   0.000    -8.794432    -8.20681
           _cons |   11.76895   .3746199    31.42   0.000     11.03471    12.50319
    ------------------------------------------------------------------------------
    
    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    _all: Identity               |
                  var(R.cohort5) |   .1133522   .0581664      .0414609    .3099001
    -----------------------------+------------------------------------------------
    year: Identity               |
                      var(_cons) |   .3264224   .1205952      .1582368    .6733679
    -----------------------------+------------------------------------------------
                   var(Residual) |   12.57375   .1473513      12.28824    12.86589
    ------------------------------------------------------------------------------
    LR test vs. linear model: chi2(2) = 0.00                  Prob > chi2 = 1.0000
    
    Note: LR test is conservative and provided only for reference.
    My question concerns the random-effects parameters: can one say that the variance components (.113 and/or .326) are statistically significant? In some MLM studies I see these variance components reported as significant; in others I see no mention of it. I thought the LR test compares two models using their log likelihoods, but it is not clear to me whether that says anything about whether the variance components themselves can be called statistically significant.

    Help is appreciated.
    Thanks,
    Marie

  • #2
    I think I figured it out. I believe I can determine the statistical significance of the variance components by dividing the variance estimate by its standard error (producing a number, say, X), and then using the following code to obtain a p-value.
    Code:
    di 2*(1 - normal(abs(X)))
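
    For instance, plugging in the cohort5 variance estimate and its standard error from the output above, the proposed calculation would look something like this:
    Code:
    * sketch of the proposed (naive) calculation: variance estimate divided
    * by its reported standard error, then a two-sided normal p-value
    di 2*(1 - normal(abs(.1133522/.0581664)))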

    • #3
      That is not quite correct. The problem is that your null hypothesis is "on the boundary of the parameter space": a variance cannot be negative, so the possible values a variance can take are 0 or more. Your null hypothesis is that the variance is 0, so it lies on the boundary of the possible values a variance could take.

      When we do testing we look at the sampling distribution of the parameter; that is, we imagine repeatedly drawing a sample from the population, estimating our model in each of these samples, and collecting the parameters. These parameters will differ from sample to sample, and the distribution of these parameters is the sampling distribution. We obviously have only one sample from the population, but we know quite a bit about how sampling works, and we can use that to compute an approximation.

      When you estimate a model with maximum likelihood (as you did when you typed mixed), an approximation that works most of the time is that the sampling distribution is a Gaussian/normal distribution. That is where your formula came from. However, the Gaussian distribution assumes that all positive and negative values are possible, which is not the case for a variance. The normal distribution is only an approximation, so it does not have to be exactly true, but your null hypothesis assumes the worst possible case: the population parameter is as close to the impossible values as it can be. This is where the approximation breaks down.

      Also see help j_chibar or Gutierrez, Carter, and Drukker (2001).

      Roberto G. Gutierrez, Shana Carter, and David M. Drukker (2001). "On boundary-value likelihood-ratio tests." Stata Technical Bulletin 60: 15-18. http://www.stata.com/products/stb/journals/stb60.pdf
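
      As a rough sketch of what this means in practice, assuming the full model has been stored as Both (a name chosen here for illustration) and the reduced model with the variance component of interest dropped is the active fit, the boundary-corrected test compares the two models and halves the usual chi-square tail probability, because under the null the LR statistic follows a 50:50 mixture of chi2(0) and chi2(1), i.e. chibar2(01):
      Code:
      * Sketch only: Both = stored full model, "." = reduced model just fit.
      * Halve the chi2(1) tail probability to get the chibar2(01) p-value.
      lrtest Both .
      display 0.5*chi2tail(1, r(chi2))    // equivalent to r(p)/2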
      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------

      • #4
        Maarten,
        Thank you for this feedback. Essentially, whatever p-value I get from the code I referenced, di 2*(1 - normal(abs(X))), is invalid.

        This still leaves me stumped as to how I can get Stata to provide me with a valid calculation for the statistical significance of my period and cohort variances.
        Marie

        • #5
          I'm confused: isn't what you want in r(table)? After estimating your model, type "mat li r(table)".

          • #6
            Hi Rich - I typed in "mat li r(table)" but (unless I missed it) that does not provide me with the statistical significance of my random-effects parameter estimates.

            • #7
              If you show your results (within CODE blocks, please, so they will be legible - see the FAQ for instructions), I can point to what I am talking about and you can decide whether it is what you mean. It will be easier, I think, if you use the "stddeviations" option on your command line, as sketched below.
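
              (A sketch of where the option goes, using the command from the original post: stddeviations is specified after the comma that follows the last random-effects equation, so the random-effects table is reported as standard deviations rather than variances.)
              Code:
              * Sketch: same model as in the original post, with random effects
              * displayed as standard deviations instead of variances
              mixed tolind c.age##c.age Inspired Fables female Liberal Moderate white ///
                  ceduc marry kids south urban rural suburban crelcom mainline ///
                  blackprot catholic nofaith tolindsd [fweight=wtssall] ///
                  || _all:R.cohort5 || year:, stddeviations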

              • #8
                Rich,
                Thanks for your help. Below are my results (sorry for the delay, I was on the radio discussing election results).
                HTML Code:
                . mat li r(table)
                
                r(table)[9,24]
                            tolind:     tolind:     tolind:     tolind:     tolind:     tolind:     tolind:     tolind:
                                         c.age#                                                                        
                               age       c.age    Inspired      Fables      female     Liberal    Moderate       white
                     b    .0403631  -.00080222   1.8998337   2.4372588   .02386721   .45475009   .05881294   1.0598132
                    se   .01490936   .00014156   .07980834   .11436071   .06710728   .08673279   .07824571   .10134008
                     z   2.7072317   -5.666973   23.804954   21.312029   .35565751   5.2431161    .7516443   10.457987
                pvalue   .00678469   1.453e-08   2.97e-125   8.78e-101   .72209707   1.579e-07     .452265   1.347e-25
                    ll   .01114129  -.00107968   1.7434123   2.2131159  -.10766064   .28475694  -.09454583   .86119034
                    ul   .06958492  -.00052477   2.0562552   2.6614016   .15539505   .62474323   .21217171   1.2584362
                    df           .           .           .           .           .           .           .           .
                  crit    1.959964    1.959964    1.959964    1.959964    1.959964    1.959964    1.959964    1.959964
                 eform           0           0           0           0           0           0           0           0
                
                            tolind:     tolind:     tolind:     tolind:     tolind:     tolind:     tolind:     tolind:
                                                                                                                       
                             ceduc       marry        kids       south       urban       rural    suburban     crelcom
                     b   .24286389  -.06616891  -.38847293  -.58764655   .01769834  -.85394901   .30845838  -.07799438
                    se   .00929203   .07551756   .08890605   .07120256   .09194642   .11211284   .08070164   .01062578
                     z   26.136804  -.87620572  -4.3694771  -8.2531663   .19248534  -7.6168706    3.822207  -7.3401114
                pvalue   1.39e-150   .38091821   .00001245   1.543e-16   .84736205   2.599e-14   .00013226   2.134e-13
                    ll   .22465185   -.2141806  -.56272558    -.727201  -.16251334  -1.0736861   .15028606  -.09882051
                    ul   .26107593   .08184278  -.21422028   -.4480921   .19791002  -.63421188   .46663069  -.05716824
                    df           .           .           .           .           .           .           .           .
                  crit    1.959964    1.959964    1.959964    1.959964    1.959964    1.959964    1.959964    1.959964
                 eform           0           0           0           0           0           0           0           0
                
                            tolind:     tolind:     tolind:     tolind:     tolind:   lns1_1_1:   lns2_1_1:    lnsig_e:
                                                                                                                       
                          mainline   blackprot    catholic     nofaith       _cons       _cons       _cons       _cons
                     b   .46705153   .08739747   .01724263   .60193139   7.7892717  -.86818339  -.53699042   1.3652601
                    se   .09943144   .15119415   .08681632   .12098629     .415639   .23282169   .19097622   .00585857
                     z    4.697222   .57804796   .19861043   4.9752034   18.740473  -3.7289627  -2.8118182   233.03633
                pvalue   2.637e-06   .56323174   .84256749   6.518e-07   2.316e-78   .00019227   .00492623           0
                    ll    .2721695  -.20893761  -.15291424   .36480262   6.9746342  -1.3245055  -.91129694   1.3537775
                    ul   .66193357   .38373255   .18739949   .83906016   8.6039092  -.41186127   -.1626839   1.3767427
                    df           .           .           .           .           .           .           .           .
                  crit    1.959964    1.959964    1.959964    1.959964    1.959964    1.959964    1.959964    1.959964
                 eform           0           0           0           0           0           0           0           0
                
                My apologies, but I am not sure how you wanted me to use the "stddeviations" option.
                Thanks,
                Marie

                • #9
                  r(table) is not the solution here. It has the same problem as the earlier suggestion of using di 2*(normal(-abs(b/se))), as discussed in #3. For example, if you look at lns2_1_1 you can see that it uses the exact same formula:

                  Code:
                  . di 2*(normal(-abs(-.53699042/.19097622 )))
                  .00492623
                  That is not surprising: r(table) is a generic matrix created for many different commands. It is only within the command itself that a specific command like mixed filters out the exceptions and chooses not to display p-values that would be wrong.
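
                  As an aside, the lns1_1_1, lns2_1_1, and lnsig_e rows in r(table) are the random-effects parameters on the log-standard-deviation scale that mixed estimates internally; a sketch like the following recovers the variances from them (using the equation names shown in the r(table) output above), and the boundary problem described in #3 applies to those parameters just the same:
                  Code:
                  * Sketch: back-transform the internal log-standard-deviation
                  * parameters to variances
                  display exp(2*_b[lns1_1_1:_cons])   // var(R.cohort5)
                  display exp(2*_b[lns2_1_1:_cons])   // var(_cons) at the year level
                  display exp(2*_b[lnsig_e:_cons])    // var(Residual)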
                  Last edited by Maarten Buis; 09 Nov 2016, 01:24.
                  ---------------------------------
                  Maarten L. Buis
                  University of Konstanz
                  Department of history and sociology
                  box 40
                  78457 Konstanz
                  Germany
                  http://www.maartenbuis.nl
                  ---------------------------------

                  • #10
                    Maarten, I see.......the r(table) provides me with the exact same results as the di 2*(normal(-abs(b/se ))). How do I determine accurate p-values for my variance components? I have looked at the documentation you suggested previously, and my apologies that my statistical skills are severely lacking, but I have not been able to figure out how to do it.
                    Marie

                    • #11
                      Or ... is my question a fool's errand? Is it simply not possible to calculate a valid statistical significance for variance components, and is that why the "mixed" command does not provide p-values for those coefficients?

                      • #12
                        Originally posted by Marie Eisenstein:
                        Or ... is my question a fool's errand? Is it simply not possible to calculate a valid statistical significance for variance components, and is that why the "mixed" command does not provide p-values for those coefficients?
                        Try the following:
                        Code:
                        local predictors c.age##c.age Inspired Fables female Liberal Moderate ///
                            white ceduc marry kids south urban rural suburban ///
                            crelcom mainline blackprot catholic nofaith tolindsd
                        
                        mixed tolind `predictors' [fweight=wtssall] || _all:R.cohort5 || year:
                        
                        *
                        * Begin here
                        *
                        estimates store Both
                        
                        // Variance component for cohort5
                        quietly mixed tolind `predictors' [fweight=wtssall] || year:
                        lrtest Both .
                        display in smcl as text r(p) / 2
                        
                        // Variance component for year
                        mixed tolind `predictors' [fweight=wtssall] || _all:R.cohort5
                        lrtest Both .
                        display in smcl as text r(p) / 2
                        Why isn't i.year in your list of explanatory variables (fixed effects)?

                        • #13
                          Joseph, I have not yet tried the code; I will do so shortly. Thank you.

                          Are you asking why I have not included year as both a fixed and a random effect? I am most interested in the societal-level effect of cohort and period (year), so I did not think year belonged as a fixed effect.

                          • #14
                            Yes, I was asking why you didn't include year in your fixed-effects equation in addition to the random-effects equation. I'm afraid that I don't understand your rationale.

                            • #15
                              Hi Joseph, I do not think year (or period) is an individual-level influence, so I didn't include it as a fixed effect. Okay, I tried your code. Below is how I used it (or at least how I thought it should be used) and the results.
                               Code:
                              mixed "predictors" [fweight=wtssall] || _all:R.cohort5 || year:
                              
                              estimates store Both.
                              
                              quietly mixed "predictors" [fweight=wtssall] || year:
                              
                              lrtest Both
                              Results:
                              HTML Code:
                              lrtest Both
                              observations differ: 14604.835 vs. 14636.068
                              r(498);
                              Ugh.
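
                               A hedged sketch of how the code from #12 might be adjusted, assuming the r(498) error arises because the two models were fit on different estimation samples (for example, through missing values in cohort5): define the predictors local, store the full model under a name with no trailing period, and restrict the reduced model to the full model's e(sample) before running lrtest.
                               Code:
                               * Sketch only: keep both models on the same observations before lrtest
                               local predictors c.age##c.age Inspired Fables female Liberal Moderate ///
                                   white ceduc marry kids south urban rural suburban ///
                                   crelcom mainline blackprot catholic nofaith tolindsd
                               
                               mixed tolind `predictors' [fweight=wtssall] || _all:R.cohort5 || year:
                               estimates store Both                // no trailing period in the name
                               
                               * refit the reduced model on the full model's estimation sample
                               quietly mixed tolind `predictors' if e(sample) [fweight=wtssall] || year:
                               lrtest Both .                       // "." is the model just fit
                               display r(p) / 2                    // halved p-value (boundary test)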
