Model to analyse an experimental panel data set

Tobias Danzeisen

Join Date: May 2018
Posts: 2

28 May 2018, 11:38

Dear Statalist,

I have conducted an economic experiment with repeated rounds, so I have a panel data set with 50 subjects and 15 rounds (750 observations). Now I am trying to analyse this panel data set. The dataset includes 3 time-invariant, subject-specific variables (x1 - x3), 1 time-invariant treatment dummy (x4) and 3 time-varying variables (x5 - x7). The dependent variable is censored at 0 and 300.

I started with a simple pooled regression controlling for time effects:
reg y x1 x2 x3 x4 x5 x6 x7 i.round

Code:

. reg y x1 x2 x3 x4 x5 x6 x7 i.round

      Source |       SS           df       MS      Number of obs   =       750
-------------+----------------------------------   F(21, 728)      =      3.48
       Model |  322855.065        21  15374.0507   Prob > F        =    0.0000
    Residual |  3215544.94       728  4416.95733   R-squared       =    0.0912
-------------+----------------------------------   Adj R-squared   =    0.0650
       Total |     3538400       749  4724.16555   Root MSE        =     66.46

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |  -10.60428   8.537124    -1.24   0.215     -27.3646    6.156038
          x2 |   2.597192   6.721262     0.39   0.699    -10.59818    15.79256
          x3 |  -2.251722   .6952007    -3.24   0.001    -3.616559   -.8868843
          x4 |   16.80329   7.249525     2.32   0.021     2.570817    31.03576
          x5 |  -26.12149   5.360657    -4.87   0.000    -36.64568   -15.59729
          x6 |   17.35713   6.221222     2.79   0.005     5.143452    29.57081
          x7 |  -.0196991   .0065702    -3.00   0.003    -.0325979   -.0068003
             |
       round |
          2  |  -5.637439   13.53217    -0.42   0.677    -32.20417    20.92929
          3  |   18.22192   13.81219     1.32   0.187     -8.89457     45.3384
          4  |   8.938925   14.32762     0.62   0.533    -19.18945     37.0673
          5  |   21.59928   15.05858     1.43   0.152    -7.964136     51.1627
          6  |    30.6742   15.82739     1.94   0.053    -.3985806    61.74698
          7  |   39.86921   16.84672     2.37   0.018     6.795267    72.94316
          8  |   38.67904   17.92418     2.16   0.031     3.489776     73.8683
          9  |   49.75817   19.20568     2.59   0.010     12.05305     87.4633
         10  |   62.44235   20.37182     3.07   0.002     22.44782    102.4369
         11  |   77.21779   21.85422     3.53   0.000     34.31298    120.1226
         12  |   84.49857   23.11455     3.66   0.000     39.11944    129.8777
         13  |   70.84038    24.4438     2.90   0.004     22.85164    118.8291
         14  |   89.78732   26.00041     3.45   0.001     38.74258    140.8321
         15  |   99.18013   27.38591     3.62   0.000     45.41534    152.9449
             |
       _cons |   82.16068   12.48479     6.58   0.000     57.65019    106.6712
------------------------------------------------------------------------------

These results are mainly in line with my expectations; however, I do not think this model is adequate. I can rule out a fixed-effects model because it cannot estimate the coefficients of my time-invariant regressors. So I continued with a random-effects model:
xtreg y x1 x2 x3 x4 x5 x6 x7 i.round, re

Code:

. xtreg y x1 x2 x3 x4 x5 x6 x7 i.round, re

Random-effects GLS regression                   Number of obs     =        750
Group variable: ID                              Number of groups  =         50

R-sq:                                           Obs per group:
     within  = 0.0899                                         min =         15
     between = 0.0854                                         avg =       15.0
     overall = 0.0839                                         max =         15

                                                Wald chi2(21)     =      71.09
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |  -10.65852   23.40097    -0.46   0.649    -56.52357    35.20653
          x2 |   3.675693   18.35401     0.20   0.841     -32.2975    39.64889
          x3 |  -1.981723   1.890282    -1.05   0.294    -5.686607    1.723162
          x4 |   4.738529   14.39209     0.33   0.742    -23.46945    32.94651
          x5 |  -20.83174   4.446905    -4.68   0.000    -29.54752   -12.11597
          x6 |   13.82984   4.788612     2.89   0.004     4.444334    23.21535
          x7 |  -.0045219   .0062786    -0.72   0.471    -.0168278     .007784
             |
       round |
          2  |  -8.784793   10.26244    -0.86   0.392     -28.8988    11.32921
          3  |   10.69709   10.61675     1.01   0.314    -10.11135    31.50554
          4  |  -2.085918   11.20945    -0.19   0.852    -24.05604     19.8842
          5  |   6.444364   12.04317     0.54   0.593    -17.15981    30.04853
          6  |   11.80622   12.91437     0.91   0.361    -13.50549    37.11792
          7  |   17.79786   13.99831     1.27   0.204    -9.638315    45.23404
          8  |   12.25996   15.18692     0.81   0.420    -17.50585    42.02577
          9  |   19.01789   16.55381     1.15   0.251    -13.42699    51.46276
         10  |   27.30547   17.81164     1.53   0.125    -7.604696    62.21564
         11  |    37.9876   19.32481     1.97   0.049     .1116635    75.86354
         12  |   41.37377   20.62967     2.01   0.045     .9403665    81.80718
         13  |   24.32797   21.97341     1.11   0.268    -18.73912    67.39506
         14  |   39.35765   23.52152     1.67   0.094    -6.743691    85.45899
         15  |   45.02379   24.91886     1.81   0.071    -3.816283    93.86386
             |
       _cons |   81.30505   24.22192     3.36   0.001     33.83095    128.7791
-------------+----------------------------------------------------------------
     sigma_u |  45.331774
     sigma_e |  50.089668
         rho |  .45026176   (fraction of variance due to u_i)
------------------------------------------------------------------------------

Then I ran the Breusch and Pagan Lagrange multiplier test with the Stata command xttest0. The result indicates that I can reject H0 (no variance in the subject-level random effect), so I infer that the pooled OLS model is not appropriate and the random-effects model is a better fit.
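
For readers who want to reproduce this step, a minimal sketch follows. It assumes the panel identifier is ID (the group variable shown in the xtreg output above) and that round is the time variable; xttest0 has to be run directly after the random-effects fit.

```stata
* Sketch only: ID and round are assumed to be the panel and time variables
xtset ID round

* Random-effects fit, as in the output above
xtreg y x1 x2 x3 x4 x5 x6 x7 i.round, re

* Breusch-Pagan LM test; H0: Var(u_i) = 0, i.e. pooled OLS is adequate
xttest0
```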

Lastly, I used a random-effects tobit model (xttobit y x1 x2 x3 x4 x5 x6 x7 i.round, ul(300) ll(0) ) to account for the censoring of the dependent variable. The results are very similar to those of the random-effects model. So I would infer that the random-effects model or the random-effects tobit model is the most adequate model for this dataset. However, that would also mean that most of the regressors are insignificant.
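
One way to check why the two sets of results are so close is to count how many observations actually sit at the limits; if only a few do, the tobit correction has little to work on. This is only a sketch under the assumption that censored values are recorded exactly at 0 and 300.

```stata
* Observations at the lower and upper censoring limits
count if y <= 0 & !missing(y)
count if y >= 300 & !missing(y)

* Random-effects tobit with both limits, as described above
xttobit y x1 x2 x3 x4 x5 x6 x7 i.round, ll(0) ul(300)
```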

I would like to ask whether my procedure and my inference are correct, or whether I made a mistake somewhere. I am thankful for every comment!

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17700
#2

28 May 2018, 11:49

Tobias:
welcome to this forum.
Your pooled OLS results are not reliable, as your standard errors were not clustered on -panelid-. As it stands, your pooled OLS treats the data as 750 independent observations (which is not the case, as you have 50 different panels with 15 observations each).
Notwithstanding your concerns about time-invariant predictors, I would perform a -hausman- test to compare -fe- vs -re- specification (as it usually follows -xttest0-; see the example in -help xttest0-).
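
As a minimal sketch of the clustering point, and assuming the panel identifier is the variable ID reported as the group variable in your -xtreg- output, the pooled regression could be re-run as follows. The coefficients stay the same; only the standard errors change.

```stata
* Pooled OLS with standard errors clustered on the subject identifier
* (ID is assumed from the -xtreg- header; replace it if yours differs)
regress y x1 x2 x3 x4 x5 x6 x7 i.round, vce(cluster ID)
```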

Kind regards,
Carlo
(Stata 19.0)

Tobias Danzeisen

Join Date: May 2018
Posts: 2

28 May 2018, 12:59

Dear Carlo,

thank you for your response! I conducted the hausman test with the following result:

Code:

. hausman fe re

Note: the rank of the differenced variance matrix (16) does not equal the number of coefficients being tested (17); be sure this is what
        you expect, or there may be problems computing the test.  Examine the output of your estimators for anything unexpected and
        possibly consider scaling your variables so that the coefficients are on a similar scale.

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |       fe           re         Difference          S.E.
-------------+----------------------------------------------------------------
          x5 |   -20.19145    -20.83174        .6402959        .5731315
          x6 |    13.53845     13.82984       -.2913932        .2110452
          x7 |   -.0023662    -.0045219        .0021557        .0014156
       round |
          2  |   -9.286331    -8.784793        -.501538        .0253312
          3  |    9.597223     10.69709       -1.099868        .6431579
          4  |    -3.69155    -2.085918       -1.605632        1.021576
          5  |    4.254581     6.444364       -2.189783        1.417601
          6  |    9.093235     11.80622       -2.712982        1.766573
          7  |    14.60423     17.79786       -3.193629        2.118984
          8  |    8.461372     12.25996       -3.798588        2.509239
          9  |     14.6075     19.01789       -4.410386        2.914544
         10  |    22.29404     27.30547       -5.011432        3.285128
         11  |    32.38269      37.9876        -5.60491        3.683311
         12  |    35.22432     41.37377       -6.149453        4.031606
         13  |    17.68822     24.32797        -6.63975        4.371876
         14  |    32.14943     39.35765       -7.208222        4.752474
         15  |    37.28797     45.02379       -7.735823        5.102445
------------------------------------------------------------------------------
                           b = consistent under Ho and Ha; obtained from xtreg
            B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                 chi2(16) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =        5.41
                Prob>chi2 =      0.9933
                (V_b-V_B is not positive definite)
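
For reference, the steps behind the table above would look roughly like the following sketch. The stored names fe and re match the hausman call; the sigmamore option is an addition not used in the run above. It bases both covariance matrices on the disturbance variance from the efficient estimator and may, but need not, avoid the "not positive definite" message.

```stata
* Fixed effects: x1-x4 are time-invariant and will be omitted
xtreg y x1 x2 x3 x4 x5 x6 x7 i.round, fe
estimates store fe

* Random effects
xtreg y x1 x2 x3 x4 x5 x6 x7 i.round, re
estimates store re

* Hausman test; sigmamore is an optional variation, not part of the original run
hausman fe re, sigmamore
```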

I am not sure whether I can simply use the Hausman test, because the time-invariant regressors are omitted from the FE model. So I additionally applied the Mundlak approach (https://blog.stata.com/2015/10/29/fixed-effects-or-random-effects-the-mundlak-approach/):

Code:

. quietly xtreg y x1 x2 x3 x4 mean_x5 mean_x6 mean_x7 i.round, vce(robust)

. test mean_x5 mean_x6 mean_x7

 ( 1)  mean_x5 = 0
 ( 2)  mean_x6 = 0
 ( 3)  mean_x7 = 0

           chi2(  3) =    7.63
         Prob > chi2 =    0.0543

So this means I can use the RE model, correct?
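
One thing that may be worth checking: in the blog post linked above, the regression keeps the time-varying regressors themselves and adds their panel means, whereas the command above contains only the means. A sketch of that version is given below. The names m_x5, m_x6 and m_x7 are hypothetical (chosen so they do not clash with existing variables), and ID is assumed to be the panel identifier.

```stata
* Panel-level means of the time-varying regressors (hypothetical names)
bysort ID: egen m_x5 = mean(x5)
bysort ID: egen m_x6 = mean(x6)
bysort ID: egen m_x7 = mean(x7)

* Random-effects regression with both the regressors and their means
xtreg y x1 x2 x3 x4 x5 x6 x7 m_x5 m_x6 m_x7 i.round, re vce(robust)

* Joint test; H0: the means have zero coefficients (RE assumption not rejected)
test m_x5 m_x6 m_x7
```

Because the panel is balanced, the round dummies have the same mean for every subject, so they need no mean terms of their own.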
