OLS vs. Multi-level model query

Sean O'Connor

Join Date: Jun 2014
Posts: 119

10 Nov 2016, 08:10

Folks,

I have a query about choosing between the two models named in the title. I am examining the impact that firm-level and regional covariates have on firm performance. Since my firm-level data are embedded within regions, I would argue that a multi-level model, which addresses the hierarchical nature of the data, is the more appropriate choice.

At the firm level I have data on the cultural composition of employees within the firm, the distance the firm is to the capital city of its country and information on the tenure of the manager.

At the regional level I have information on regional wealth, population size, and population density.

What I'm finding, however, is that according to the LR test the multilevel model does not fit the data better than OLS.

I find this hard to believe, given that the firms are embedded within regions. Perhaps I'm misspecifying the multi-level model? Could anyone recommend something I should be correcting for?

Code:

 reg logpoints logsimpson_b loggdp logpop logdensity logtimemanager  inversedistance i.league, robust

Linear regression                               Number of obs     =        134
                                                F(12, 121)        =       5.06
                                                Prob > F          =     0.0000
                                                R-squared         =     0.2930
                                                Root MSE          =     .29209

---------------------------------------------------------------------------------
                |               Robust
      logpoints |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
   logsimpson_b |   .6217996   .1938224     3.21   0.002      .238077    1.005522
         loggdp |   .0922478   .1154515     0.80   0.426    -.1363189    .3208145
         logpop |   .1059771   .0285484     3.71   0.000      .049458    .1624963
     logdensity |   .0113455   .0457175     0.25   0.804    -.0791645    .1018554
 logtimemanager |   .0544615   .0215614     2.53   0.013     .0117751     .097148
inversedistance |   1.656778   .4969604     3.33   0.001     .6729134    2.640642
                |
         league |
        France  |   .1495962   .0988801     1.51   0.133    -.0461631    .3453555
       Germany  |   .0145193   .0995747     0.15   0.884    -.1826151    .2116537
         Italy  |  -.0085604   .1069785    -0.08   0.936    -.2203526    .2032318
   Netherlands  |   .0776461   .1209717     0.64   0.522    -.1618494    .3171415
      Portugal  |   .0595672   .1024641     0.58   0.562    -.1432875     .262422
         Spain  |   .0059637   .1083742     0.06   0.956    -.2085915     .220519
                |
          _cons |  -.9424459   1.425794    -0.66   0.510    -3.765181     1.88029
---------------------------------------------------------------------------------

Code:

. mixed logpoints logsimpson_b loggdp logpop logdensity logtimemanager inversedistance  i.league , || region:

Performing EM optimization:

Performing gradient-based optimization:

Iteration 0:   log likelihood = -18.861665 
Iteration 1:   log likelihood = -18.342474 
Iteration 2:   log likelihood = -18.342463 
Iteration 3:   log likelihood = -18.342463 

Computing standard errors:

Mixed-effects ML regression                     Number of obs     =        134
Group variable: region                          Number of groups  =         72

                                                Obs per group:
                                                              min =          1
                                                              avg =        1.9
                                                              max =          8

                                                Wald chi2(12)     =      55.03
Log likelihood = -18.342463                     Prob > chi2       =     0.0000

---------------------------------------------------------------------------------
      logpoints |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
   logsimpson_b |   .6147686   .1886914     3.26   0.001     .2449403    .9845969
         loggdp |   .0883351    .118281     0.75   0.455    -.1434915    .3201616
         logpop |   .1049657   .0358002     2.93   0.003     .0347986    .1751328
     logdensity |   .0130486   .0390213     0.33   0.738    -.0634317    .0895289
 logtimemanager |    .054269   .0209066     2.60   0.009     .0132927    .0952452
inversedistance |   1.699659   .4638576     3.66   0.000     .7905147    2.608803
                |
         league |
        France  |     .15391   .1232184     1.25   0.212    -.0875937    .3954137
       Germany  |   .0188387   .1094834     0.17   0.863    -.1957447    .2334222
         Italy  |  -.0050063   .1175322    -0.04   0.966    -.2353652    .2253527
   Netherlands  |   .0792474    .099756     0.79   0.427    -.1162708    .2747655
      Portugal  |   .0504112   .1059869     0.48   0.634    -.1573192    .2581416
         Spain  |   .0089773   .1191453     0.08   0.940    -.2245431    .2424977
                |
          _cons |  -.9203046    1.53357    -0.60   0.548    -3.926046    2.085437
---------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
region: Identity             |
                  var(_cons) |   .0024372    .008268      3.16e-06    1.881563
-----------------------------+------------------------------------------------
               var(Residual) |   .0746093   .0120441      .0543732    .1023769
------------------------------------------------------------------------------
LR test vs. linear model: chibar2(01) = 0.09          Prob >= chibar2 = 0.3816


Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#2

10 Nov 2016, 08:45

It seems you have a small sample size, even for the OLS model, considering the number of predictors selected. Besides, there are too many groups in the mixed model, which leaves an average of just 1.9 observations per region.
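A minimal sketch of how you could inspect the group sizes yourself, assuming it is run right after the -mixed- call in #1 and that region is the grouping variable. The names insample, n_in_region and first_in_region are hypothetical, made up for this illustration.

```stata
* Run directly after the -mixed- command from #1 (data still in memory).
* insample, n_in_region and first_in_region are hypothetical variable names.

* Flag the observations actually used in the estimation.
generate byte insample = e(sample)

* Count the firms per region among the estimation sample.
bysort region: egen n_in_region = total(insample)

* Flag one row per region so that each region is counted once.
egen first_in_region = tag(region) if insample

* Distribution and mean of the region sizes.
tabulate n_in_region if first_in_region
summarize n_in_region if first_in_region
```

The mean reported by -summarize- should match the "avg" shown under "Obs per group" in the -mixed- header, and the table shows how many regions contribute only a single firm.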

Best regards,

Marcos
Sean O'Connor

Join Date: Jun 2014

Posts: 119
#3

10 Nov 2016, 09:44

Hi Marcos,

Thank you for your reply. At the moment the regional level of my model uses NUTS 2 data. Would you advise moving to a more aggregated regional level, i.e. NUTS 1?
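If I went that route, a minimal sketch might look like the following. It assumes a hypothetical string variable nuts2_code holding the NUTS 2 code (for example "FR10"), which is not in my output above; nuts1_code and nuts1 are likewise names invented here. It relies on the NUTS 1 code being the first three characters of the NUTS 2 code.

```stata
* Assumes a hypothetical string variable nuts2_code, e.g. "FR10".
* nuts1_code and nuts1 are hypothetical names created for this sketch.

* The NUTS 1 code is the first three characters of the NUTS 2 code.
generate nuts1_code = substr(nuts2_code, 1, 3)

* Convert the string code into a labelled numeric grouping variable.
encode nuts1_code, generate(nuts1)

* Same fixed part as in #1, but with the random intercept at NUTS 1.
mixed logpoints logsimpson_b loggdp logpop logdensity logtimemanager inversedistance i.league || nuts1:

* Share of the residual variance attributed to the NUTS 1 level.
estat icc
```

One caveat with this sketch: the regional covariates would still be measured at NUTS 2, so they would vary within the NUTS 1 groups.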

I can't increase the sample size as my sample is based on football teams who took part in the 2015/2016 season across these 7 countries.

I suppose my concern is more that a journal may deem an OLS estimation too simplistic, but if I show that I tested other models it may suffice.
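One way I could present that comparison is to store each fit and tabulate them side by side. This is only a sketch: the OLS fit with standard errors clustered on region is an extra variant that nobody in this thread has suggested, and ols, ols_cl and ml are hypothetical names for the stored estimates.

```stata
* Assumes the data from #1 are in memory.
* ols, ols_cl and ml are hypothetical names for the stored results.

* OLS with heteroskedasticity-robust standard errors, as in #1.
regress logpoints logsimpson_b loggdp logpop logdensity logtimemanager inversedistance i.league, vce(robust)
estimates store ols

* Same OLS model with standard errors clustered on region (added variant).
regress logpoints logsimpson_b loggdp logpop logdensity logtimemanager inversedistance i.league, vce(cluster region)
estimates store ols_cl

* Random-intercept model, as in #1.
mixed logpoints logsimpson_b loggdp logpop logdensity logtimemanager inversedistance i.league || region:
estimates store ml

* Coefficients and standard errors side by side.
estimates table ols ols_cl ml, b(%9.4f) se(%9.4f) stats(N)
```

The table also lists the extra variance-component equations that -mixed- stores, which the two -regress- columns leave empty.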
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17685
#4

11 Nov 2016, 01:57

Sean:
I would go OLS.
-mixed- would be theoretically more appealing, but your sample size is too limited (and, as Marcos said, you have rather too many predictors) to go down that road.

Kind regards,
Carlo
(Stata 19.0)
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#5

11 Nov 2016, 06:08

I agree with Carlo. Besides, with so many groups, and fewer than 2 observations per group on average, the grouping practically reproduces the individual observations themselves. That may well be the reason for the similarity of results between the models.
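As a rough illustration, the variance components reported in #1 can be turned into an intraclass correlation by hand. The sketch below uses only the two estimates from that output; the scalar names are hypothetical, and the design effect uses the equal-cluster-size formula, which is only an approximation here because the region sizes range from 1 to 8.

```stata
* Variance components copied from the -mixed- output in #1.
* v_region, v_resid, icc and deff are hypothetical scalar names.
scalar v_region = .0024372
scalar v_resid  = .0746093

* Intraclass correlation: share of residual variance between regions.
scalar icc = v_region / (v_region + v_resid)
display "ICC = " %6.4f icc

* Approximate design effect for an average of 1.9 firms per region.
scalar deff = 1 + (1.9 - 1) * icc
display "Design effect = " %6.4f deff
```

The ICC works out to roughly 0.03 and the design effect to roughly 1.03, which is in line with the two sets of estimates being so close. Running -estat icc- after -mixed- should report the same ICC along with a confidence interval.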

Best regards,

Marcos
Sean O'Connor

Join Date: Jun 2014

Posts: 119
#6

11 Nov 2016, 06:15

Thank you both for your comments. Having read Kreft (1996), it seems she suggests a 30/30 rule of thumb: a sample of 30 groups with 30 individuals per group.

I would also recommend this online piece by J Hox which gives a good discussion of Multilevel models.

http://www.joophox.net/publist/whenwhy.pdf

Reference:
Kreft, Ita G.G. (1996) Are Multilevel Techniques Necessary? An Overview, Including Simulation Studies. California State University, Los Angeles.

Last edited by Sean O'Connor; 11 Nov 2016, 07:04.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17685
#7

11 Nov 2016, 06:57

Sean:
A full reference for Kreft (1996) would be appreciated. Thanks.

Kind regards,
Carlo
(Stata 19.0)