Pooled OLS with interaction on almost all explanatory variables

Sven Niederhoefer

Join Date: Aug 2020
Posts: 2


16 Sep 2020, 07:59

Dear Statalist,

I have panel data covering 763 firms over 15 years, taken from an industry consortium. I want to estimate how changes in memberships across competing industry consortia, the number of simultaneous affiliations, the role within the focal consortium, and the provision of a platform product (time-invariant) affect the firms' product certifications. The basic model looks like this:

productcerts_t = beta0 + beta1 * changemem_t-1 + beta2 * simulmem_t-1 + beta3 * role_t-1 + beta4 * platform + controls

While the model is rather straightforward, I am facing the issue that firms must be members in order to certify products. I therefore included a dummy variable member_t and its interactions with all other variables, except for role, which already requires member_t to be 1. However, in a more complete model with all control variables, this causes multicollinearity and, because of the interactions, produces a large set of results. The model then looks like this:

productcerts_t = beta0 + beta1 * changemem_t-1 + beta2 * simulmem_t-1 + beta3 * role_t-1 + beta4 * platform + beta5 * member_t + beta6 * member_t * changemem_t-1 + beta7 * member_t * simulmem_t-1 + beta8 * member_t * platform + controls

I was wondering whether there is a more elegant way that still yields consistent results. Intuitively, I thought about filtering the observations: I excluded all records where member_t == 0 and ran a pooled OLS with time dummies and standard errors clustered on id. But I am not sure whether that is an appropriate approach.

Here are some results I computed:

1) pooled OLS with interactions and clustered standard errors

Code:

. reg productcerts i.member##c.L1.changemem i.member##c.L1.simulmem L1.role i.member##i.platform i.year, cluster(id)

Linear regression                               Number of obs     =     10,682
                                                F(21, 762)        =       5.34
                                                Prob > F          =     0.0000
                                                R-squared         =     0.0789
                                                Root MSE          =     2.1651

                                            (Std. Err. adjusted for 763 clusters in id)
---------------------------------------------------------------------------------------
                      |               Robust
         productcerts |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------------+----------------------------------------------------------------
             1.member |   .4415453   .0704454     6.27   0.000     .3032552    .5798353
                      |
            changemem |
                  L1. |   .5532723   .5693309     0.97   0.331    -.5643709    1.670915
                      |
  member#cL.changemem |
                   1  |  -1.356654   .5850375    -2.32   0.021    -2.505131   -.2081776
                      |
             simulmem |
                  L1. |   .2238948   .1329231     1.68   0.093    -.0370441    .4848337
                      |
   member#cL.simulmem |
                   1  |   -.125708   .2387898    -0.53   0.599    -.5944721     .343056
                      |
                 role |
                  L1. |   3.496489   1.625868     2.15   0.032     .3047766    6.688201
                      |
           1.platform |   .1912549   .0916024     2.09   0.037      .011432    .3710779
                      |
      member#platform |
                 1 1  |    1.72839   .6695502     2.58   0.010     .4140079    3.042772
                      |
                 year |
                2007  |   .0248361    .064558     0.38   0.701    -.1018965    .1515687
                2008  |  -.0068504   .0425496    -0.16   0.872    -.0903788     .076678
                2009  |   .0131293   .0814032     0.16   0.872    -.1466718    .1729304
                2010  |  -.0718353   .0605614    -1.19   0.236    -.1907222    .0470516
                2011  |   .0194899   .0722105     0.27   0.787    -.1222652     .161245
                2012  |  -.0246552    .063305    -0.39   0.697    -.1489281    .0996177
                2013  |   .0549405   .0778698     0.71   0.481    -.0979243    .2078054
                2014  |  -.0239332    .068818    -0.35   0.728    -.1590286    .1111623
                2015  |   .1155241   .1268944     0.91   0.363      -.13358    .3646283
                2016  |   .1556162   .0833659     1.87   0.062     -.008038    .3192703
                2017  |   .2129104   .1003894     2.12   0.034     .0158378     .409983
                2018  |   .0882369   .0852473     1.04   0.301    -.0791104    .2555843
                2019  |   .2275257   .1756277     1.30   0.196    -.1172458    .5722973
                      |
                _cons |  -.0412679   .0527809    -0.78   0.435    -.1448811    .0623454
---------------------------------------------------------------------------------------

2) pooled OLS model with filtered observations, excluding records where member_t == 0

Code:

. reg productcerts c.L1.changemem c.L1.simulmem L1.role i.platform i.year if member, cluster(id)

Linear regression                               Number of obs     =      3,189
                                                F(17, 762)        =       4.06
                                                Prob > F          =     0.0000
                                                R-squared         =     0.0612
                                                Root MSE          =     3.7748

                                    (Std. Err. adjusted for 763 clusters in id)
-------------------------------------------------------------------------------
              |               Robust
 productcerts |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
    changemem |
          L1. |  -.8704477    .463063    -1.88   0.061    -1.779478     .038583
              |
     simulmem |
          L1. |   .0518086   .2144131     0.24   0.809    -.3691018    .4727191
              |
         role |
          L1. |   3.817154   1.762396     2.17   0.031     .3574265    7.276881
              |
   1.platform |   1.894842   .6702957     2.83   0.005     .5789965    3.210687
              |
         year |
        2007  |   .0040985   .5051178     0.01   0.994    -.9874892    .9956862
        2008  |  -.1097982   .3155404    -0.35   0.728      -.72923    .5096335
        2009  |  -.1420047    .441294    -0.32   0.748    -1.008301    .7242916
        2010  |  -.3944345    .415614    -0.95   0.343    -1.210319      .42145
        2011  |    .057463   .4554769     0.13   0.900    -.8366756    .9516016
        2012  |  -.2025696   .4350108    -0.47   0.642    -1.056531    .6513922
        2013  |   .1460105   .4491477     0.33   0.745    -.7357033    1.027724
        2014  |  -.0861011   .4196607    -0.21   0.837    -.9099294    .7377273
        2015  |   .3009069   .4922737     0.61   0.541    -.6654667    1.267281
        2016  |   .3548497   .4243433     0.84   0.403    -.4781711     1.18787
        2017  |   .3556781   .4230937     0.84   0.401    -.4748896    1.186246
        2018  |   .1858268   .4198763     0.44   0.658    -.6384248    1.010078
        2019  |   .5825402   .5546162     1.05   0.294    -.5062169    1.671297
              |
        _cons |   .3286396   .4078965     0.81   0.421    -.4720947    1.129374
-------------------------------------------------------------------------------

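Compared with the member-specific effects implied by the first model (main effect plus interaction), the second model shows only slight changes in the coefficient estimates.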
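To make that comparison concrete: for members, the first model implies an effect of lagged changemem of .5532723 + (-1.356654) = -.8033817, against -.8704477 in the filtered model; for lagged simulmem, .2238948 + (-.125708) = .0981868 against .0518086; and for platform, .1912549 + 1.72839 = 1.9196449 against 1.894842. A minimal sketch of how these sums, with standard errors, could be obtained after the first regression is given below. It assumes the data are already -xtset id year- and that the coefficient names match those shown by -regress, coeflegend-; please check the names on your own output.

```stata
* Sketch only: assumes the data are xtset id year and the variables are named as in the post
regress productcerts i.member##c.L1.changemem i.member##c.L1.simulmem L1.role ///
    i.member##i.platform i.year, vce(cluster id)

* List the coefficient names Stata stores, to confirm the _b[] references below
regress, coeflegend

* Implied effects for member == 1: main effect plus interaction
lincom _b[L.changemem] + _b[1.member#cL.changemem]
lincom _b[L.simulmem] + _b[1.member#cL.simulmem]
lincom _b[1.platform] + _b[1.member#1.platform]
```

The remaining gap to the filtered model arises because role and the year dummies are not interacted with member here. If every regressor were interacted with member, the implied member == 1 point estimates should coincide with those from the regression restricted to member == 1, although the standard errors need not.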

Code:

. quietly: xtreg productcerts i.member##c.L1.changemem i.member##c.L1.simulmem L1.role i.member##i.platform i.year
. xttest0
Breusch and Pagan Lagrangian multiplier test for random effects

        productcerts[id,t] = Xb + u[id] + e[id,t]

        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
               product~s |   5.079473       2.253769
                       e |     4.0121       2.003023
                       u |   .6070492       .7791336

        Test:   Var(u) = 0
                             chibar2(01) =  1353.13
                          Prob > chibar2 =   0.0000

Further, the Breusch-Pagan LM test favors a random-effects model over pooled OLS. The Hausman test and suest cannot be run on these data/models.

I would appreciate any suggestions on how to deal with this problem. Would you recommend sticking with an RE/FE model and using the interactions? Is it legitimate, under some assumptions, to filter observations for pooled OLS? Or is there another approach?

Best,
Sven


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#2

16 Sep 2020, 10:21

Sven:
welcome to this forum.
The more interactions you include on the right-hand side of your regression equation (panel or not), the harder it becomes, and disproportionately so, to convey the results of your research. The safest choice is to focus on the predictors that can give a true and fair view of the data generating process that you're investigating.
That said, whenever you invoke non-default standard errors, -hausman- is not your friend for comparing -fe- vs -re-, as it supports default standard errors only.
Hence, you should switch to the community-contributed command -xtoverid-, which, being a bit old-fashioned, does not support -fvvarlist- notation. The usual fix is to prefix your -xtreg- code with -xi:- and/or to create the interactions by hand (like in the old days).
I would not consider pooled OLS due to the evidence of a panelwise effect.
In addition, filtering observations can be seen as a way of making up data (hence, it may be difficult to defend in front of a reviewer).
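As a rough illustration of the -xtoverid- route, the sketch below builds the lags and interactions by hand and uses -xi:- for the year dummies. It assumes the data are already -xtset id year- and that -xtoverid- has been installed from SSC (it may also need -ivreg2- and -ranktest-). The generated variable names (L1changemem, mem_changemem, and so on) are made up for this example.

```stata
* Sketch only: assumes xtset id year has been run and the variables are named as in post #1
* ssc install xtoverid        // community-contributed; uncomment if not yet installed

* Lags created by hand, to avoid time-series operators in the variable list
generate L1changemem = L1.changemem
generate L1simulmem  = L1.simulmem
generate L1role      = L1.role

* Interactions with the membership dummy, created by hand
generate mem_changemem = member * L1changemem
generate mem_simulmem  = member * L1simulmem
generate mem_platform  = member * platform

* Random-effects model with cluster-robust standard errors; xi: expands i.year into dummies
xi: xtreg productcerts member L1changemem mem_changemem L1simulmem mem_simulmem ///
    L1role platform mem_platform i.year, re vce(cluster id)

* Test of overidentifying restrictions (fixed vs random effects), robust to clustering
xtoverid
```

A rejection of the null in -xtoverid- points toward -fe-. Note that under -fe- the coefficient on the time-invariant platform cannot be estimated, although its interaction with member can, provided membership varies within firms.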

Last edited by Carlo Lazzaro; 16 Sep 2020, 10:23.

Kind regards,
Carlo
(Stata 19.0)
Sven Niederhoefer

Join Date: Aug 2020

Posts: 2
#3

18 Sep 2020, 05:00

Thank you very much. It helped me a lot!

Best,
Sven