  • OLS vs. Multi-level model query

    Folks,

    I have a query regarding the choice between the two models named in the title. I am examining the impact that firm-level and regional covariates have on firm performance. Since my firm-level data are embedded within regions, I would argue that a multi-level model, which addresses the hierarchical nature of the data, is more appropriate.

    At the firm level I have data on the cultural composition of employees within the firm, the distance from the firm to the capital city of its country, and the tenure of the manager.

    At the regional level I have information on regional wealth, population size, and population density.

    What I'm finding, however, is that according to the LR test the multi-level model does not fit the data better than OLS.

    I find this hard to believe, given that the firms are embedded within regions. Perhaps I'm misspecifying the multi-level model; could anyone recommend something I should be correcting for?


    Code:
     reg logpoints logsimpson_b loggdp logpop logdensity logtimemanager  inversedistance i.league, robust
    
    Linear regression                               Number of obs     =        134
                                                    F(12, 121)        =       5.06
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.2930
                                                    Root MSE          =     .29209
    
    ---------------------------------------------------------------------------------
                    |               Robust
          logpoints |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    ----------------+----------------------------------------------------------------
       logsimpson_b |   .6217996   .1938224     3.21   0.002      .238077    1.005522
             loggdp |   .0922478   .1154515     0.80   0.426    -.1363189    .3208145
             logpop |   .1059771   .0285484     3.71   0.000      .049458    .1624963
         logdensity |   .0113455   .0457175     0.25   0.804    -.0791645    .1018554
     logtimemanager |   .0544615   .0215614     2.53   0.013     .0117751     .097148
    inversedistance |   1.656778   .4969604     3.33   0.001     .6729134    2.640642
                    |
             league |
            France  |   .1495962   .0988801     1.51   0.133    -.0461631    .3453555
           Germany  |   .0145193   .0995747     0.15   0.884    -.1826151    .2116537
             Italy  |  -.0085604   .1069785    -0.08   0.936    -.2203526    .2032318
       Netherlands  |   .0776461   .1209717     0.64   0.522    -.1618494    .3171415
          Portugal  |   .0595672   .1024641     0.58   0.562    -.1432875     .262422
             Spain  |   .0059637   .1083742     0.06   0.956    -.2085915     .220519
                    |
              _cons |  -.9424459   1.425794    -0.66   0.510    -3.765181     1.88029
    ---------------------------------------------------------------------------------
    Code:
    . mixed logpoints logsimpson_b loggdp logpop logdensity logtimemanager inversedistance  i.league , || region:
    
    Performing EM optimization:
    
    Performing gradient-based optimization:
    
    Iteration 0:   log likelihood = -18.861665 
    Iteration 1:   log likelihood = -18.342474 
    Iteration 2:   log likelihood = -18.342463 
    Iteration 3:   log likelihood = -18.342463 
    
    Computing standard errors:
    
    Mixed-effects ML regression                     Number of obs     =        134
    Group variable: region                          Number of groups  =         72
    
                                                    Obs per group:
                                                                  min =          1
                                                                  avg =        1.9
                                                                  max =          8
    
                                                    Wald chi2(12)     =      55.03
    Log likelihood = -18.342463                     Prob > chi2       =     0.0000
    
    ---------------------------------------------------------------------------------
          logpoints |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ----------------+----------------------------------------------------------------
       logsimpson_b |   .6147686   .1886914     3.26   0.001     .2449403    .9845969
             loggdp |   .0883351    .118281     0.75   0.455    -.1434915    .3201616
             logpop |   .1049657   .0358002     2.93   0.003     .0347986    .1751328
         logdensity |   .0130486   .0390213     0.33   0.738    -.0634317    .0895289
     logtimemanager |    .054269   .0209066     2.60   0.009     .0132927    .0952452
    inversedistance |   1.699659   .4638576     3.66   0.000     .7905147    2.608803
                    |
             league |
            France  |     .15391   .1232184     1.25   0.212    -.0875937    .3954137
           Germany  |   .0188387   .1094834     0.17   0.863    -.1957447    .2334222
             Italy  |  -.0050063   .1175322    -0.04   0.966    -.2353652    .2253527
       Netherlands  |   .0792474    .099756     0.79   0.427    -.1162708    .2747655
          Portugal  |   .0504112   .1059869     0.48   0.634    -.1573192    .2581416
             Spain  |   .0089773   .1191453     0.08   0.940    -.2245431    .2424977
                    |
              _cons |  -.9203046    1.53357    -0.60   0.548    -3.926046    2.085437
    ---------------------------------------------------------------------------------
    
    ------------------------------------------------------------------------------
      Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
    -----------------------------+------------------------------------------------
    region: Identity             |
                      var(_cons) |   .0024372    .008268      3.16e-06    1.881563
    -----------------------------+------------------------------------------------
                   var(Residual) |   .0746093   .0120441      .0543732    .1023769
    ------------------------------------------------------------------------------
    LR test vs. linear model: chibar2(01) = 0.09          Prob >= chibar2 = 0.3816

  • #2
    It seems you have a small sample size, even under the OLS model, considering the number of predictors selected. Besides, there are too many groups in the mixed model, which leaves an average of just 1.9 observations per region.
    Best regards,

    Marcos
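
    One quick way to see how thinly the 134 observations are spread over the 72 regions is to tabulate region sizes. This is only a sketch: it assumes the grouping variable is called region, as in the mixed command above, that the data in memory are the 134 observations used in estimation, and that n_in_region and one_per_region are hypothetical helper variables.

    ```stata
    * Number of observations in each region (n_in_region is a hypothetical name)
    bysort region: generate n_in_region = _N

    * Flag exactly one observation per region so each region is counted once
    egen byte one_per_region = tag(region)

    * How many regions contain 1, 2, ... observations
    tabulate n_in_region if one_per_region
    ```

    Regions with a single observation give the model very little to separate region-level variance from residual variance.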


    • #3
      Hi Marcos,

      Thank you for your reply. At the moment the regional level of my model uses NUTS 2 data. Would you advise moving to a more aggregated regional level, such as NUTS 1?

      I can't increase the sample size, as my sample is based on the football teams that took part in the 2015/2016 season across these 7 countries.

      I suppose my concern is more that a journal may deem an OLS estimation too simplistic, but if I show that I tested other models it may suffice.


      • #4
        Sean:
        I would go OLS.
        -mixed- would be theoretically more appealing, but your sample size is too limited (and, as Marcos said, you have rather too many predictors) to go down that road.
        Kind regards,
        Carlo
        (Stata 18.0 SE)
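
        If OLS is the choice, one common way to still acknowledge the regional grouping is to cluster the standard errors on region instead of using plain robust ones. A sketch only, assuming the same variables as in the original post; whether 72 clusters of this size are adequate is a judgment call that this thread does not settle.

        ```stata
        * Same specification as the original -reg- command,
        * but with standard errors clustered on region
        regress logpoints logsimpson_b loggdp logpop logdensity logtimemanager inversedistance i.league, vce(cluster region)
        ```

        The coefficients are identical to those from the original regression; only the standard errors change.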


        • #5
          I agree with Carlo. Besides, with so many groups, and fewer than 2 observations per group on average, the grouping is practically a restatement of the observations themselves. That may well be the reason for the similarity of results between the models.
          Best regards,

          Marcos
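
          The variance components in the mixed output point the same way. As a rough check using only the numbers reported in the first post, the share of unexplained variance attributable to region (the intraclass correlation) is the region variance divided by the sum of the region and residual variances.

          ```stata
          * Intraclass correlation by hand from the reported variance components:
          * var(_cons) = .0024372, var(Residual) = .0746093
          display .0024372 / (.0024372 + .0746093)

          * Refit the random-intercept model from the first post, then let Stata
          * report the same quantity with a confidence interval
          mixed logpoints logsimpson_b loggdp logpop logdensity logtimemanager inversedistance i.league || region:
          estat icc
          ```

          The hand calculation comes to roughly 0.03, so only about 3% of the residual variation sits at the region level, which is in line with the non-significant LR test.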


          • #6
            Thank you both for your comments. Having read Kreft (1996), it seems she suggests a 30/30 rule of thumb: a sample of 30 groups with 30 individuals per group.

            I would also recommend this online piece by J. Hox, which gives a good discussion of multilevel models.

            http://www.joophox.net/publist/whenwhy.pdf

            Reference:
            Kreft, Ita G.G. (1996) Are Multilevel Techniques Necessary? An Overview, Including Simulation Studies. California State University, Los Angeles.
            Last edited by Sean O'Connor; 11 Nov 2016, 07:04.


            • #7
              Sean:
              a full reference for Kreft (1996) would be appreciated. Thanks.
              Kind regards,
              Carlo
              (Stata 18.0 SE)
