  • Model to analyse an experimental panel data set

    Dear Statalist,

    I have conducted an economic experiment with repeated rounds, so I have a panel data set with 50 subjects and 15 rounds, which I am now trying to analyse. The data set includes 3 time-invariant subject-specific variables (x1 - x3), 1 time-invariant treatment dummy (x4) and 3 time-varying variables (x5 - x7). The dependent variable is censored at 0 (from below) and at 300 (from above).

    I started with a simple pooled regression controlling for time effects:
    reg y x1 x2 x3 x4 x5 x6 x7 i.round

    Code:
    . reg y x1 x2 x3 x4 x5 x6 x7 i.round
    
          Source |       SS           df       MS      Number of obs   =       750
    -------------+----------------------------------   F(21, 728)      =      3.48
           Model |  322855.065        21  15374.0507   Prob > F        =    0.0000
        Residual |  3215544.94       728  4416.95733   R-squared       =    0.0912
    -------------+----------------------------------   Adj R-squared   =    0.0650
           Total |     3538400       749  4724.16555   Root MSE        =     66.46
    
    ------------------------------------------------------------------------------
               y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              x1 |  -10.60428   8.537124    -1.24   0.215     -27.3646    6.156038
              x2 |   2.597192   6.721262     0.39   0.699    -10.59818    15.79256
              x3 |  -2.251722   .6952007    -3.24   0.001    -3.616559   -.8868843
              x4 |   16.80329   7.249525     2.32   0.021     2.570817    31.03576
              x5 |  -26.12149   5.360657    -4.87   0.000    -36.64568   -15.59729
              x6 |   17.35713   6.221222     2.79   0.005     5.143452    29.57081
              x7 |  -.0196991   .0065702    -3.00   0.003    -.0325979   -.0068003
                 |
           round |
              2  |  -5.637439   13.53217    -0.42   0.677    -32.20417    20.92929
              3  |   18.22192   13.81219     1.32   0.187     -8.89457     45.3384
              4  |   8.938925   14.32762     0.62   0.533    -19.18945     37.0673
              5  |   21.59928   15.05858     1.43   0.152    -7.964136     51.1627
              6  |    30.6742   15.82739     1.94   0.053    -.3985806    61.74698
              7  |   39.86921   16.84672     2.37   0.018     6.795267    72.94316
              8  |   38.67904   17.92418     2.16   0.031     3.489776     73.8683
              9  |   49.75817   19.20568     2.59   0.010     12.05305     87.4633
             10  |   62.44235   20.37182     3.07   0.002     22.44782    102.4369
             11  |   77.21779   21.85422     3.53   0.000     34.31298    120.1226
             12  |   84.49857   23.11455     3.66   0.000     39.11944    129.8777
             13  |   70.84038    24.4438     2.90   0.004     22.85164    118.8291
             14  |   89.78732   26.00041     3.45   0.001     38.74258    140.8321
             15  |   99.18013   27.38591     3.62   0.000     45.41534    152.9449
                 |
           _cons |   82.16068   12.48479     6.58   0.000     57.65019    106.6712
    ------------------------------------------------------------------------------

    These results are mainly in line with my expectations; however, I do not think this model is adequate. I can rule out a fixed-effects model because of my time-invariant regressors, so I continued with a random-effects model:
    xtreg y x1 x2 x3 x4 x5 x6 x7 i.round, re

    Code:
    . xtreg y x1 x2 x3 x4 x5 x6 x7 i.round, re
    
    Random-effects GLS regression                   Number of obs     =        750
    Group variable: ID                              Number of groups  =         50
    
    R-sq:                                           Obs per group:
         within  = 0.0899                                         min =         15
         between = 0.0854                                         avg =       15.0
         overall = 0.0839                                         max =         15
    
                                                    Wald chi2(21)     =      71.09
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
    
    ------------------------------------------------------------------------------
               y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
              x1 |  -10.65852   23.40097    -0.46   0.649    -56.52357    35.20653
              x2 |   3.675693   18.35401     0.20   0.841     -32.2975    39.64889
              x3 |  -1.981723   1.890282    -1.05   0.294    -5.686607    1.723162
              x4 |   4.738529   14.39209     0.33   0.742    -23.46945    32.94651
              x5 |  -20.83174   4.446905    -4.68   0.000    -29.54752   -12.11597
              x6 |   13.82984   4.788612     2.89   0.004     4.444334    23.21535
              x7 |  -.0045219   .0062786    -0.72   0.471    -.0168278     .007784
                 |
           round |
              2  |  -8.784793   10.26244    -0.86   0.392     -28.8988    11.32921
              3  |   10.69709   10.61675     1.01   0.314    -10.11135    31.50554
              4  |  -2.085918   11.20945    -0.19   0.852    -24.05604     19.8842
              5  |   6.444364   12.04317     0.54   0.593    -17.15981    30.04853
              6  |   11.80622   12.91437     0.91   0.361    -13.50549    37.11792
              7  |   17.79786   13.99831     1.27   0.204    -9.638315    45.23404
              8  |   12.25996   15.18692     0.81   0.420    -17.50585    42.02577
              9  |   19.01789   16.55381     1.15   0.251    -13.42699    51.46276
             10  |   27.30547   17.81164     1.53   0.125    -7.604696    62.21564
             11  |    37.9876   19.32481     1.97   0.049     .1116635    75.86354
             12  |   41.37377   20.62967     2.01   0.045     .9403665    81.80718
             13  |   24.32797   21.97341     1.11   0.268    -18.73912    67.39506
             14  |   39.35765   23.52152     1.67   0.094    -6.743691    85.45899
             15  |   45.02379   24.91886     1.81   0.071    -3.816283    93.86386
                 |
           _cons |   81.30505   24.22192     3.36   0.001     33.83095    128.7791
    -------------+----------------------------------------------------------------
         sigma_u |  45.331774
         sigma_e |  50.089668
             rho |  .45026176   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------


    Then I ran the Breusch and Pagan Lagrange multiplier test with the Stata command xttest0. The result indicates that I can reject H0, so I infer that the pooled OLS model is not appropriate and the random-effects model is a better fit.
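
    For reference, the test sequence described above can be sketched as follows. This assumes the data are in memory and that ID and round are the panel and time identifiers (ID is the group variable reported in the -xtreg- output; using round as the time variable in -xtset- is an assumption).

    ```stata
    * Declare the panel structure (ID = subject, round = period; assumed names)
    xtset ID round

    * Random-effects GLS regression, as in the output above
    xtreg y x1 x2 x3 x4 x5 x6 x7 i.round, re

    * Breusch and Pagan LM test; H0: Var(u_i) = 0 (no subject-level random effect)
    xttest0
    ```

    Rejecting H0 only speaks against pooled OLS relative to random effects; it says nothing about random versus fixed effects.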

    Lastly, I used a random-effects tobit model (xttobit y x1 x2 x3 x4 x5 x6 x7 i.round, ul(300) ll(0) ) to account for the censoring of the dependent variable. The results are nearly identical to those of the random-effects model. So I would infer that the random-effects model or the random-effects tobit model is the most adequate model for this data set. However, that would also mean that most of the regressors are insignificant.

    I would like to ask whether my procedure and my inference are correct, or whether I made a mistake somewhere. I am thankful for every comment!


  • #2
    Tobias:
    welcome to this forum.
    Your pooled OLS results are not reliable, as your standard errors were not clustered on -panelid-. As it stands, your pooled OLS treats the data as 750 independent observations (which is not the case, as you have 50 different panels with 15 observations each).
    Notwithstanding your concerns about time-invariant predictors, I would perform a -hausman- test to compare the -fe- and -re- specifications (it usually follows -xttest0-; see the example in -help xttest0-).
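    A minimal sketch of both steps, assuming the data are already -xtset- and that the panel identifier is called ID (as the -xtreg- output above suggests); the variable names are taken from the original post:

    ```stata
    * Pooled OLS with standard errors clustered on the subject identifier
    regress y x1 x2 x3 x4 x5 x6 x7 i.round, vce(cluster ID)

    * Hausman comparison; both models must use the default (non-robust) VCE
    xtreg y x1 x2 x3 x4 x5 x6 x7 i.round, fe
    estimates store fe
    xtreg y x1 x2 x3 x4 x5 x6 x7 i.round, re
    estimates store re
    hausman fe re
    ```

    The time-invariant x1 - x4 are dropped from the -fe- model, so -hausman- compares only the coefficients that both models estimate.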
    Kind regards,
    Carlo
    (Stata 19.0)

    • #3
      Dear Carlo,

      Thank you for your response! I ran the Hausman test with the following result:

      Code:
      . hausman fe re
      
      Note: the rank of the differenced variance matrix (16) does not equal the number of coefficients being tested (17); be sure this is what
              you expect, or there may be problems computing the test.  Examine the output of your estimators for anything unexpected and
              possibly consider scaling your variables so that the coefficients are on a similar scale.
      
                       ---- Coefficients ----
                   |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                   |       fe           re         Difference          S.E.
      -------------+----------------------------------------------------------------
                x5 |   -20.19145    -20.83174        .6402959        .5731315
                x6 |    13.53845     13.82984       -.2913932        .2110452
                x7 |   -.0023662    -.0045219        .0021557        .0014156
             round |
                2  |   -9.286331    -8.784793        -.501538        .0253312
                3  |    9.597223     10.69709       -1.099868        .6431579
                4  |    -3.69155    -2.085918       -1.605632        1.021576
                5  |    4.254581     6.444364       -2.189783        1.417601
                6  |    9.093235     11.80622       -2.712982        1.766573
                7  |    14.60423     17.79786       -3.193629        2.118984
                8  |    8.461372     12.25996       -3.798588        2.509239
                9  |     14.6075     19.01789       -4.410386        2.914544
               10  |    22.29404     27.30547       -5.011432        3.285128
               11  |    32.38269      37.9876        -5.60491        3.683311
               12  |    35.22432     41.37377       -6.149453        4.031606
               13  |    17.68822     24.32797        -6.63975        4.371876
               14  |    32.14943     39.35765       -7.208222        4.752474
               15  |    37.28797     45.02379       -7.735823        5.102445
      ------------------------------------------------------------------------------
                                 b = consistent under Ho and Ha; obtained from xtreg
                  B = inconsistent under Ha, efficient under Ho; obtained from xtreg
      
          Test:  Ho:  difference in coefficients not systematic
      
                       chi2(16) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                                =        5.41
                      Prob>chi2 =      0.9933
                      (V_b-V_B is not positive definite)
      I am not sure whether I can simply rely on the Hausman test, because the time-invariant regressors are omitted from the FE model. So I additionally applied the Mundlak approach (https://blog.stata.com/2015/10/29/fixed-effects-or-random-effects-the-mundlak-approach/):

      Code:
      . quietly xtreg y x1 x2 x3 x4 mean_x5 mean_x6 mean_x7 i.round, vce(robust)
      
      . test mean_x5 mean_x6 mean_x7
      
       ( 1)  mean_x5 = 0
       ( 2)  mean_x6 = 0
       ( 3)  mean_x7 = 0
      
                 chi2(  3) =    7.63
               Prob > chi2 =    0.0543
      So this means I can use the RE model, correct?
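
      For comparison, a hedged sketch of the Mundlak specification in which the time-varying regressors enter together with their subject-level means, which is how the approach is usually set up. This is not the command that produced the output above. It assumes ID is the panel identifier and the data are -xtset-; the mean_* variables are rebuilt here so the block does not depend on how they were created earlier.

      ```stata
      * Subject-level means of the time-varying regressors
      foreach v of varlist x5 x6 x7 {
          capture drop mean_`v'
          bysort ID: egen mean_`v' = mean(`v')
      }

      * RE regression including both the regressors and their subject means
      xtreg y x1 x2 x3 x4 x5 x6 x7 mean_x5 mean_x6 mean_x7 i.round, re vce(robust)

      * Joint test; H0: the coefficients on the means are all zero
      test mean_x5 mean_x6 mean_x7
      ```

      Failing to reject H0 would be consistent with using the RE estimator; the result may differ from the test reported above because the specification differs.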
