
  • Test whether multi-level analysis is needed

    Hi Statalist members,

    I was wondering: is there any explicit test that can tell me whether I need a multi-level analysis?

    Is it correct to check for the percentage of the variance in the outcome variable that is attributable to “membership” in the group in which the observations are nested? For example: The observations in my study are business units (over ~15 years); and these business units are nested in firms. I study the influence of a certain strategy of a business unit (independent variable of interest) on the financial performance of the business unit (dependent variable).

    That is, I checked the percentage of the variance in the financial performance of business units (DV) that is attributable to firm membership. If that number turns out to be very low, is it correct that a multi-level model may not be needed?

    More specifically, I did the following:
    • First, I ran the following Stata code. BU_performance is the financial performance of the business unit; and Firm_ID is the identifier of the firm to which the business unit belongs:
    Code:
    mixed BU_performance || Firm_ID:, var
    • The result of this is:
    Code:
    Mixed-effects ML regression                     Number of obs     =      4,496
    Group variable: Firm_ID                          Number of groups  =        280
    
                                                    Obs per group:
                                                                  min =          1
                                                                  avg =       16.1
                                                                  max =         65
    
                                                    Wald chi2(0)      =          .
    Log likelihood =  1484.6113                     Prob > chi2       =          .
    
    --------------------------------------------------------------------------------
    BU_performance |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------------+----------------------------------------------------------------
             _cons |  -.0047854   .0025938    -1.84   0.065    -.0098692    .0002985
    --------------------------------------------------------------------------------
    
    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    Firm_ID: Identity             |
                      var(_cons) |   4.34e-22   1.28e-21      1.33e-24    1.42e-19
    -----------------------------+------------------------------------------------
                   var(Residual) |    .030249    .000638      .0290241    .0315257
    ------------------------------------------------------------------------------
    LR test vs. linear model: chibar2(01) = 0.00          Prob >= chibar2 = 1.0000
    • Then, I computed the following:
    Code:
    var(_cons) / (var(_cons) + var(Residual))
      = 4.34e-22 / (4.34e-22 + 0.030249)
      ≈ 0.0000
    • From the above, it seems that essentially none of the variance in the financial performance of business units (far below 1%) is attributable to firm membership. This seems like a very low value. I have heard that anything above 5% definitely requires a multi-level model, but values lower than 1% may not.
    Thanks so much for any advice on whether the above approach is alright.

    Franz
    Last edited by Franz Hopp; 10 Oct 2020, 07:00.

  • #2
    You mention some kind of business strategy as the predictor of interest, but I don't see it in the model. What does the variance component look like in the complete model?

    Is this some kind of cross-sectional snapshot of business performance of 4500 divisions of 280 holding companies (an average of 16 "business units" per "firm")? I wouldn't expect much synergy between acquisitions of a holding company, but who knows, maybe there's a cultural factor that varies among them in the level of suppressing innovation in business strategy of vacuumed-up companies.
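
    For concreteness, the complete model might look something like the following, where Strategy is a placeholder for whatever your actual strategy variable is called:
    Code:
    * Random-intercept model with the strategy predictor included.
    * "Strategy" is a stand-in name for the actual predictor of interest.
    mixed BU_performance Strategy || Firm_ID:, var
    The firm-level variance component is then reported as var(_cons) in the random-effects table, and it can differ from the one in the null model once the predictor is in.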



    • #3
      Disclaimer: I found this article (accepted manuscript, really) only moments ago, so have barely skimmed it at this point. But Myth #1 may be of interest. ;-) HTH.
      --
      Bruce Weaver
      Email: [email protected]
      Version: Stata/MP 18.5 (Windows)



      • #4
        Franz Hopp the intraclass correlation can be calculated by Stata postestimation and will give you the proportion of variance in the outcome that is due to the nesting: https://www.stata.com/features/overv...ilevel-models/. That is how you determine whether you need to account for the nesting in the data.
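
        In case it helps, the postestimation command is estat icc, run right after the null model from #1:
        Code:
        * Intercept-only model, as in #1
        mixed BU_performance || Firm_ID:, var
        * Intraclass correlation: share of outcome variance at the firm level
        estat icc
        This reports the same ratio var(_cons) / (var(_cons) + var(Residual)) that was computed by hand in #1, along with a confidence interval.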



        • #5
          Tom Scott , the author of the manuscript I linked to in #3 takes a different view. From that manuscript:
          Myth 1: When the intraclass correlation is low, multilevel modeling is not needed.
          --
          Bruce Weaver
          Email: [email protected]
          Version: Stata/MP 18.5 (Windows)



          • #6
            Good to know. Thanks Bruce Weaver. I wonder if you could still control the Type I error rate by using clustered standard errors with a simpler regression model? I'm not an expert on MLM, so there might be other benefits to using a MLM even when the ICC is low. There could also be drawbacks to using it when it's not necessary.
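
            For what it's worth, that simpler model with cluster-robust standard errors would look something like this (again with Strategy standing in for the actual predictor variable):
            Code:
            * Single-level OLS with standard errors clustered on firm,
            * as an alternative to the mixed model when the ICC is near zero.
            * "Strategy" is a placeholder variable name.
            regress BU_performance Strategy, vce(cluster Firm_ID)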



            • #7
              This is from Maas & Hox (2004) in Computational Statistics & Data Analysis:

              "In general, what is at issue in multilevel modeling is not so much the ICC, but the design effect, which indicates how much the standard errors are underestimated (Kish, 1965). In cluster samples, the design effect is approximately equal to 1+(average cluster size-1)*ICC. If the design effect is smaller than two, using single-level analysis on multilevel data does not seem to lead to overly misleading results (Muthten and Satorra, 1995)."

              It's always a good exercise to run it both ways as a sensitivity analysis if you can't come to a clear conclusion on the correct model to run.
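
              Plugging in the numbers from #1 (average cluster size 16.1, estimated ICC on the order of 1e-20, i.e. effectively zero), the design effect here is essentially 1, well below the threshold of two:
              Code:
              * Design effect = 1 + (average cluster size - 1) * ICC
              display 1 + (16.1 - 1) * 0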

              Tom

