
  • Problem with LR test in a multilevel model

    Dear Statalist,

    I am trying to figure out whether it is necessary to include a random intercept in a three-level model (time nested in firms nested in sectors). First, I estimate a model without random intercepts and compare it with a model that has a random intercept for firms (and a random coefficient on a time-varying firm-level variable). Thereafter, I compare the latter model with another that also includes a random intercept for sectors. The problem is that when I try the LR test on the last two models, the test assumes that the sector-level model is nested in the firm-level model (which cannot be the case) and thus does not report the test (see below).
    Is there anything wrong with my approach?
    I think the problem has to do with vce(robust). Can I do the same comparison without robust errors, reporting the results from the model with robust errors but using the LR test from the non-robust model to decide whether to include the random intercept?

    Thanks a lot in advance.

    Code:
    .
    . lrtest logit logit_f, force
    
    Likelihood-ratio test                                 LR chi2(3)  =   7938.01
    (Assumption: logit nested in logit_f)                 Prob > chi2 =    0.0000
    
    . local Likelihood_ratio_test1 = r(chi2)
    
    . local pvalue_Likelihood_ratio_test1 = r(p)
    
    .
    . lrtest logit_f logit_s, force
    
    Likelihood-ratio test                                 LR chi2(0)  =   -186.77
    (Assumption: logit_s nested in logit_f)               Prob > chi2 =         .
    When I do not use vce(robust), there is no problem with the tests:
    Code:
     lrtest logit logit_f, force
    
    Likelihood-ratio test                                 LR chi2(3)  =   7938.01
    (Assumption: logit nested in logit_f)                 Prob > chi2 =    0.0000
    
    . local Likelihood_ratio_test1 = r(chi2)
    
    . local pvalue_Likelihood_ratio_test1 = r(p)
    
    .
    . lrtest logit_f logit_s, force
    
    Likelihood-ratio test                                 LR chi2(1)  =    186.77
    (Assumption: logit_f nested in logit_s)               Prob > chi2 =    0.0000

  • #2
    Likelihood ratio tests are only applicable to nested models. If your models are not nested, then you cannot use an LR test to contrast them.

    But, you say
    Thereafter, I compare the latter model with another that also includes a random intercept for sectors. [emphasis added]
    If I understand that sentence correctly, your sector model is the same as the firm-level model except that you have added a random intercept for sectors. If that is the case, then the firm-level model is, in fact, nested in the sector-level model, and you can feel free to make use of the LR test (see the sketch below).
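    For concreteness, here is a minimal sketch of that nesting with -melogit-, using hypothetical names (y for the outcome, x for a time-varying firm-level predictor, firm and sector for the grouping variables); it illustrates the idea rather than reproducing your actual specification:
    Code:
    * firm-level model: random intercept and random slope on x for firms
    melogit y x || firm: x
    estimates store logit_f
    
    * sector-level model: same fixed part, plus a random intercept for sectors
    melogit y x || sector: || firm: x
    estimates store logit_s
    
    * logit_f is nested in logit_s (one extra variance parameter), so:
    lrtest logit_f logit_s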



    • #3
      Dear Clyde, thanks for your response. In fact, I am comparing a model (with the same fixed part) that has a random intercept and a random coefficient for firms against the same model with an additional random intercept for sectors. My point is that when both models are estimated with vce(robust), the LR test assumes that the sector-level model is nested in the firm-level model (which cannot be the case, because the sector model has an extra parameter) and reports -186.77 as the test statistic, as you can see above. However, when both models are estimated without vce(robust), the test is correct, giving 186.77 under the correct nesting assumption (see also above).
      My question is: can I report the model with robust errors, but use the estimation without robust errors just for the LR test, and decide whether to include the random intercepts/coefficients based on the latter?



      • #4
        OK, I didn't pay attention to the robust standard errors. LR tests are not valid for models with robust VCE. Had you not gagged Stata by using the -force- option, it would have told you that in the first place. (General rule: avoid -force- options in any command unless you have an overwhelmingly compelling reason to use them and have seriously considered the numerous ways in which they can lead you to disaster.)

        Personally, I do not rely on this kind of hypothesis test to select models. So when you are caught between two approaches to hypothesis testing, neither of which I think is useful, it is hard for me to advise you on the choice. My own approach to this kind of situation is to simply inspect the sector-level random intercept output, and if the estimated variance at that level is large enough to matter for practical purposes, I include it. If it's so small that it is at best a rounding error, I drop it.



        • #5
          Dear Clyde, I get the idea of not using the -force- option. My next question, then, is: how large is large? Is there any rule of thumb?



          • #6
            There is no rule of thumb, because it is specific to the particular data and results. The general principle is to look at how much things can vary in the absence of the random intercepts at the sector level.

            Say, for example, one of the predictor variables is dichotomous and has a coefficient of 7.5. Then that one variable can produce variation of as much as 7.5. For the moment, let's assume it is your only predictor. Now, what is the standard deviation of the random intercept? If it's, say, 0.01, then even a sector that is 3 SD out adds or subtracts only 0.03, which is negligible compared to 7.5. On the other hand, if the standard deviation of the sector intercepts were 2, even a sector just 2 SD out could add or subtract 4, which is quite comparable to the effect of that fixed predictor.

            With continuous predictor variables, it's a little harder: you have to consider the range over which the predictor can vary. Even with a small coefficient, the variation contributed by a variable can be quite large if the variable has a large range; similarly, even with a large coefficient, the contribution can be tiny if the variable has a very narrow range of variation. So here you compare the product of the coefficient and the range of the variable to the intercept SD. Finally, when there are multiple predictors instead of just one, I compare the intercept SD to the variation contributed by the variable that contributes the least. When the intercept SD is a tiny fraction of the variation attributable to any of the fixed effects, I omit it; if it's comparable, I include it.

            You are probably thinking, "how small is tiny?" In my experience, in most analyses it is pretty obvious whether we're talking small or large. If you really have a borderline case, I would look at graphs of predicted vs. observed outcomes for both models and see if there is a visibly noticeable difference in the fit of the two models. If neither the numerical comparisons nor the graphs moves you in either direction, it probably makes no difference which way you go. (A numerical sketch of these comparisons follows below.)
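            To make the arithmetic concrete, here is a minimal sketch of those comparisons in Stata, using the made-up numbers from the example above (illustrative values, not estimates from any model):
            Code:
            * dichotomous predictor with coefficient 7.5: the largest shift it can produce
            display "max shift from the predictor: " 7.5
            * intercept SD = 0.01: even a 3-SD sector barely moves the outcome
            display "3-SD sector when SD = 0.01:   " 3*0.01
            * intercept SD = 2: a 2-SD sector rivals the fixed effect
            display "2-SD sector when SD = 2:      " 2*2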



            • #7
              Dear Clyde, again, thanks a lot for your explanation. I used the LR test because it is what some textbooks advise. When you refer to "how much things can vary in the absence of the sectoral random intercept", does that refer to the total model or just to sectoral characteristics? I mean, do you also mean how much firm characteristics vary without the sectoral RE?
              In your toy example, you said to look at the standard deviation of the random intercept. Please let me know if I am wrong, but some textbooks say not to rely on the standard error of the random part because of some statistical issues. So I am wondering whether it is reliable to use it as you advise in the example.
              Just to show you some preliminary results (new_xactin is the sector id and ident is the firm id): first the empty model (without the random coefficient at the firm level, since that variable is not yet in the fixed part, so it would be misspecified), then a model including some variables.
              Regarding the predicted vs. observed outcomes, this seems a good idea; my next step is to figure out how to do it.
              Code:
              Integration method: mvaghermite                 Integration pts.  =          7
              
                                                              Wald chi2(0)      =          .
              Log likelihood = -15795.032                     Prob > chi2       =          .
              ----------------------------------------------------------------------------------
                       innprod |       Odds   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -----------------+----------------------------------------------------------------
                         _cons |   4.752195   .5752172    12.88   0.000     3.748545    6.024566
              -----------------+----------------------------------------------------------------
              new_xactin       |
                     var(_cons)|   .7391216   .1874259                      .4496441    1.214963
              -----------------+----------------------------------------------------------------
              new_xactin>ident |
                     var(_cons)|   6.798422   .2909135                      6.251498    7.393195
              ----------------------------------------------------------------------------------
              Note: Estimates are transformed only in the first equation.
              LR test vs. logistic model: chi2(2) = 8589.67             Prob > chi2 = 0.0000
              Code:
              ------------------------------------------------------------------------------------------
                               innprod | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
              -------------------------+----------------------------------------------------------------
                            offshoring |
                                   L2. |   1.476632   .2278703     2.53   0.012     1.091232    1.998148
                                       |
                               coopera |
                                   L2. |   1.259742   .0689721     4.22   0.000      1.13156    1.402444
                                       |
                            internalRD |
                                   L2. |   .8115264   .1209025    -1.40   0.161     .6060232    1.086716
                                       |
                             permanent |
                                   L2. |   1.409765   .0870756     5.56   0.000     1.249026     1.59119
                                       |
                             empexport |
                                   L2. |   1.305341   .1135474     3.06   0.002      1.10073    1.547987
                                       |
                               L2.size |
                     50-200 employees  |   1.360274   .1293034     3.24   0.001     1.129052    1.638849
                    201-499 employees  |   1.358755   .2228581     1.87   0.062     .9852133    1.873925
                500 or more employees  |   2.398132   .5909977     3.55   0.000     1.479457    3.887263
                                       |
                             li_tamano |
                                   L2. |   1.718396   .5728601     1.62   0.104     .8940439     3.30284
                                       |
                           lnfirm_year |   2.477632     .63532     3.54   0.000      1.49889    4.095471
                                       |
                               foreign |
                                   L2. |   .6903056   .1050962    -2.43   0.015     .5122116    .9303222
                                       |
                           Boffshoring |
                                   L2. |   2.252019   .7101236     2.57   0.010      1.21385    4.178105
                                       |
                              Bcoopera |
                                   L2. |   3.612085   .5042106     9.20   0.000     2.747507    4.748726
                                       |
                           BinternalRD |
                                   L2. |   .7566068   .2652906    -0.80   0.426       .38055     1.50428
                                       |
                            Bpermanent |
                                   L2. |   3.783621   .5436291     9.26   0.000     2.855008    5.014272
                                       |
                            Bempexport |
                                   L2. |   1.507892   .2306786     2.68   0.007     1.117256     2.03511
                                       |
                                 Bsize |
                                   L2. |   1.002996   .0943634     0.03   0.975     .8340974    1.206095
                                       |
                            Bli_tamano |
                                   L2. |   50.11761   37.36694     5.25   0.000     11.62378    216.0894
                                       |
                          Blnfirm_year |   .5182797   .1273858    -2.67   0.007     .3201481    .8390299
                                       |
                              Bforeign |
                                   L2. |   1.236173   .2804795     0.93   0.350     .7924047    1.928463
                                       |
                                  year |
                                 2008  |    1.38634   .0972364     4.66   0.000      1.20828    1.590641
                                 2009  |   1.752979   .1358349     7.24   0.000     1.505979    2.040491
                                 2010  |   2.155875   .1875166     8.83   0.000     1.817969    2.556586
                                 2011  |   .5619744   .0502511    -6.44   0.000     .4716317    .6696224
                                 2012  |   .3369766   .0328964   -11.14   0.000     .2782938    .4080336
                                 2013  |    .326922   .0347454   -10.52   0.000     .2654472    .4026337
                                 2014  |   .3328917   .0381729    -9.59   0.000     .2658858    .4167838
                                 2015  |   .3518506   .0433289    -8.48   0.000     .2763991     .447899
                                       |
                                 _cons |   .1594906   .0448785    -6.52   0.000     .0918795    .2768545
              -------------------------+----------------------------------------------------------------
              new_xactin               |
                             var(_cons)|   .5739675   .1542652                      .3389299    .9719965
              -------------------------+----------------------------------------------------------------
              new_xactin>ident         |
                     var(L2.offshoring)|   2.143627   .6372619                      1.197022    3.838806
                             var(_cons)|   6.999104    .303869                      6.428168     7.62075
              -------------------------+----------------------------------------------------------------
              new_xactin>ident         |
               cov(L2.offshoring,_cons)|  -.0585711   .1730978    -0.34   0.735    -.3978366    .2806943
              ------------------------------------------------------------------------------------------



              • #8
                So, looking at the output for the second model, the variance estimate for new_xactin is about 0.574, which corresponds to an SD of about 0.76. Now, this is not directly comparable to the rest of the output, because the rest is in the odds-ratio metric. But we can easily convert it to the coefficient metric by taking logarithms. Let's use L2.coopera as an example: its odds ratio is about 1.26, so the corresponding coefficient is about 0.23. So a 1 SD difference in the sector random effect (and presumably there will be ample numbers of sectors that are 1 SD out on that effect, because, after all, it's only 1 SD) has an effect on your outcome roughly three times as large as the effect of a unit difference in coopera. Now, I don't know what kind of variable coopera is. If it is dichotomous, I would say it is clear that the sector-level random effects are an important part of the model. If, on the other hand, coopera is a continuous variable on a scale from 0 to 1000 (I just made that up for an example), then differences in 0.23*coopera are of the order of magnitude of 0.23*1000, which is in the ballpark of 230. In that case even a 5 SD outlier on the sector intercept contributes only 0.76*5, or about 3.8, which is less than 2% as impactful. If all the variables work out that way, then I think we could say that the sector intercept is probably ignorable. (The conversion arithmetic is sketched below.)
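                For reference, the conversion arithmetic above as a minimal Stata sketch, using the numbers from your posted output:
                Code:
                * sector-level intercept: variance -> standard deviation
                display sqrt(.5739675)                 // about 0.76
                * odds ratio for L2.coopera -> coefficient on the log-odds scale
                display ln(1.259742)                   // about 0.23
                * ratio: a 1-SD sector shift vs. a one-unit change in coopera
                display sqrt(.5739675)/ln(1.259742)    // about 3.3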

                Just how reliable the standard deviations of random effects are is complicated. It depends a great deal on the number of levels of that variable. It also depends on how many integration points were used in estimating the model. I'm not aware of any rules of thumb to apply. Certainly I would not do anything that relied on those parameter estimates being precise, even as precise as their standard errors make them appear to be. But, as you can see, what I'm proposing here is really looking at order of magnitude estimates, and unless you have only a small handful of sectors, in which case you should not be using them as random effects anyway, the precision will be good enough for this purpose.



                • #9
                  OK Clyde, now I get it: you were talking about converting the variance to the standard deviation, not about the standard error of the parameter; sorry. After your explanation everything is clear now. In fact, the variable you took is a dummy, so I think it is necessary to include the RE.
                  Any hint on how to do (or what to read about) the predicted vs. observed comparison? I mean, the prediction must also include the random effects, right? So it is not as simple as -predict, xb- in the OLS case, I assume.
                  And as always, thanks a lot for your help with multilevel stuff.



                  • #10
                    For comparing observed and expected outcomes in a logistic model, you need to do something like the Hosmer-Lemeshow test of calibration. That's available after a one-level -logit- as -estat gof-, but for a multilevel model you have to code it yourself. See #4 at https://www.statalist.org/forums/for...-after-melogit for an example of how to do it; a rough sketch of the idea is below.
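                    The linked post has the complete code; here is only a rough, hedged outline of the idea (it assumes the -melogit- model has just been fit; innprod is the outcome from your output, and the number of groups is illustrative):
                    Code:
                    * predicted probabilities; after -melogit-, -predict, mu- by default
                    * conditions on the empirical Bayes predictions of the random effects
                    predict phat, mu
                    * divide observations into quantile groups of predicted risk
                    xtile grp = phat, nquantiles(10)
                    * mean observed outcome vs. mean predicted probability per group
                    preserve
                    collapse (mean) observed=innprod predicted=phat, by(grp)
                    list grp observed predicted
                    restore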



                    • #11
                      Thanks for the recommendation, Clyde. I assume it should not make any difference that I have three levels instead of the two in the example you refer to. I will try it and let you know.



                      • #12
                        The code given there should work with any number of levels.



                        • #13
                          Dear Clyde, here again. Just to show you the graph you advised me to make of predicted vs. observed outcomes. Using your code with 1,000 groups, I have:
                          Code:
                           sum difference if decile<=1000
                          
                              Variable |        Obs        Mean    Std. Dev.       Min        Max
                          -------------+---------------------------------------------------------
                            difference |      1,000   -.6225122    2.308956  -14.40255   6.226903
                          Correct me if I am wrong, but would you say that this is a bad model? I mean, the average difference looks very low, right? And even though the dots are not fully on the line, the scale of the graph is not in the hundreds either.
                          How would you interpret these results?

                          Thanks in advance.
                          [Graph: pred_outcome_1000.png — predicted vs. observed outcomes, 1,000 groups]



                          • #14
                            The graph looks good, but not excellent. It is clear that when your model predicts a relatively low IP_hat, in the range of 0 to 30, the observed probability is typically even lower still. The average difference overall is fairly small because that systematic discrepancy in the low range of predicted values is mostly offset by the observed values typically being higher than predicted when the predicted value is > 30. Now, the discrepancy is not very great, and it may be small enough for practical purposes; that is a value judgment that I cannot make because I have no knowledge of this content area, but you probably can. If you decide that this discrepancy is larger than you are comfortable with, this pattern of discrepancies often can be improved by adding an interaction term between some variables.

                            All of that said, at the start of this thread the driving issue was whether or not to include a sector-level random intercept in the model. So there should be two of these graphs: one from the model with the sector intercept and one without. The question then is not so much how good each graph looks, but rather which of the two looks better, or whether they are even visibly different at all.

