  • Testing for misspecification in panel regression models

    Hi,

    I would like to test for misspecification (using ovtest or linktest) after estimating an RE model.


    My Stata code is given below. (I started with xtreg, fe and then ran xttest3, which suggested the presence of heteroskedasticity. I then ran xtoverid, which pointed to the RE model as the preferred model.)
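
    A minimal sketch of the workflow described above, written as a do-file. It assumes the panel is declared with districtid and year, that xttest3 and xtoverid (both community-contributed) are installed from SSC, and that the data contain four years. The local macro xvars and the dummy names yr2 to yr4 are illustrative, not from the original post. Year dummies are built by hand for the RE step because xtoverid may not accept i.year notation in some versions.

    ```stata
    * One-time installs of the community-contributed commands (assumed available on SSC)
    * ssc install xttest3
    * ssc install xtoverid

    xtset districtid year

    local xvars shareofirr_final Share_urbanpop shareofnonagriareainga ///
        shareofscandst averagelandsize_ha sharemarginal ///
        numberofbanksper1000sqkm gddppercapitaRS populationdensitypersqkm ///
        rainfall meantemperature

    * Fixed effects, then the modified Wald test for groupwise heteroskedasticity
    xtreg SI_Final `xvars' i.year, fe
    xttest3

    * Hand-made year dummies (yr1 is the omitted base year)
    tabulate year, generate(yr)

    * Random effects with cluster-robust standard errors, then the FE-vs-RE test
    xtreg SI_Final `xvars' yr2 yr3 yr4, re vce(cluster districtid)
    xtoverid
    ```

    With xtreg, the robust option is equivalent to vce(cluster districtid), which is why the output reports standard errors adjusted for 27 clusters.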


    xtreg SI_Final shareofirr_final Share_urbanpop shareofnonagriareainga shareofscandst averagelandsize_ha sharemarginal numberofbanksper1000sqkm gddppercapitaRS populationdensitypersqkm rainfall meantemperature i.year,re robust



    Random-effects GLS regression                   Number of obs      =       108
    Group variable: districtid                      Number of groups   =        27

    R-sq:  within  = 0.3362                         Obs per group: min =         4
           between = 0.3131                                        avg =       4.0
           overall = 0.3151                                        max =         4

                                                    Wald chi2(14)      =     73.29
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

                                         (Std. Err. adjusted for 27 clusters in districtid)
    ---------------------------------------------------------------------------------------
                             |                Robust
                    SI_Final |      Coef.  Std. Err.       z   P>|z|   [95% Conf. Interval]
    -------------------------+-------------------------------------------------------------
            shareofirr_final |   .0001311   .0000531    2.47   0.013    .0000271   .0002352
              Share_urbanpop |   -.000071    .000834   -0.09   0.932   -.0017056   .0015636
      shareofnonagriareainga |  -.0158636   .0075536   -2.10   0.036   -.0306684  -.0010589
              shareofscandst |   .0023731   .0018152    1.31   0.191   -.0011845   .0059308
          averagelandsize_ha |  -.0024828   .0359169   -0.07   0.945   -.0728786    .067913
               sharemarginal |   .0010549   .0007419    1.42   0.155   -.0003991   .0025089
    numberofbanksper1000sqkm |   .0006007   .0003625    1.66   0.097   -.0001097   .0013111
             gddppercapitaRS |  -4.19e-07   1.07e-07   -3.91   0.000   -6.29e-07  -2.09e-07
    populationdensitypersqkm |   .0001655   .0000661    2.50   0.012     .000036    .000295
                    rainfall |  -9.77e-06   9.13e-06   -1.07   0.285   -.0000277   8.13e-06
             meantemperature |  -.0032167   .0050669   -0.63   0.526   -.0131477   .0067143
                             |
                        year |
                        2006 |   .0044992   .0131529    0.34   0.732   -.0212801   .0302784
                        2011 |   .0391637    .019784    1.98   0.048    .0003877   .0779397
                        2016 |   .0402596   .0270659    1.49   0.137   -.0127887   .0933078
                             |
                       _cons |   .7324753   .1512481    4.84   0.000    .4360344   1.028916
    -------------------------+-------------------------------------------------------------
                     sigma_u |  .07243071
                     sigma_e |  .03383597
                         rho |  .82086397   (fraction of variance due to u_i)
    ---------------------------------------------------------------------------------------



    Then I used manual commands to test omitted variable bias in the RE model.


    predict fit, xbu
    gen fitt_2=fit^2
    gen fitt_3=fit^3
    gen fitt_4=fit^4

    After that, I re-ran the RE model with these powers included as regressors:



    xtreg SI_Final shareofirr_final Share_urbanpop shareofnonagriareainga shareofscandst averagelandsize_ha sharemarginal numberofbanksper1000sqkm gddppercapitaRS populationdensitypersqkm rainfall meantemperature fitt_2 fitt_3 fitt_4 i.year,re robust

    Random-effects GLS regression                   Number of obs      =       108
    Group variable: districtid                      Number of groups   =        27

    R-sq:  within  = 0.3899                         Obs per group: min =         4
           between = 0.9973                                        avg =       4.0
           overall = 0.9220                                        max =         4

                                                    Wald chi2(17)      =  36580.52
    corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

                                         (Std. Err. adjusted for 27 clusters in districtid)
    ---------------------------------------------------------------------------------------
                             |                Robust
                    SI_Final |      Coef.  Std. Err.       z   P>|z|   [95% Conf. Interval]
    -------------------------+-------------------------------------------------------------
            shareofirr_final |   .0000161   .0000492    0.33   0.744   -.0000804   .0001126
              Share_urbanpop |  -.0000547   .0001708   -0.32   0.749   -.0003895   .0002801
      shareofnonagriareainga |   .0009251   .0008492    1.09   0.276   -.0007394   .0025895
              shareofscandst |   .0000777   .0002727    0.29   0.776   -.0004567   .0006122
          averagelandsize_ha |  -.0071001   .0110872   -0.64   0.522   -.0288305   .0146304
               sharemarginal |  -.0003738   .0005034   -0.74   0.458   -.0013603   .0006128
    numberofbanksper1000sqkm |  -.0000579   .0001949   -0.30   0.766     -.00044   .0003242
             gddppercapitaRS |   9.55e-08   7.09e-08    1.35   0.178   -4.34e-08   2.34e-07
    populationdensitypersqkm |  -.0000119   8.11e-06   -1.47   0.142   -.0000278   3.99e-06
                    rainfall |   3.21e-07   3.44e-06    0.09   0.926   -6.41e-06   7.06e-06
             meantemperature |  -.0001208   .0008875   -0.14   0.892   -.0018602   .0016186
                      fitt_2 |   11.67344   6.944275    1.68   0.093   -1.937085   25.28397
                      fitt_3 |  -22.06439   14.91381   -1.48   0.139   -51.29491   7.166136
                      fitt_4 |    12.4602   8.881179    1.40   0.161   -4.946595   29.86699
                             |
                        year |
                        2006 |  -.0013051   .0111626   -0.12   0.907   -.0231833   .0205732
                        2011 |   -.009395   .0147289   -0.64   0.524   -.0382633   .0194732
                        2016 |    -.01071   .0174531   -0.61   0.539   -.0449175   .0234976
                             |
                       _cons |  -.4272979   .4024529   -1.06   0.288   -1.216091   .3614953
    -------------------------+-------------------------------------------------------------
                     sigma_u |          0
                     sigma_e |  .03162606
                         rho |          0   (fraction of variance due to u_i)
    ---------------------------------------------------------------------------------------

    . test fitt_2 fitt_3 fitt_4

    ( 1) fitt_2 = 0
    ( 2) fitt_3 = 0
    ( 3) fitt_4 = 0

    chi2( 3) = 4892.64
    Prob > chi2 = 0.0000
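
    A variant that may be worth comparing, sketched here as a do-file under stated assumptions. After xtreg, re, the predict option xbu returns the linear prediction plus the estimated district effect u_i, whereas xb returns the linear prediction alone. Powers of an xbu prediction can therefore proxy for the district effects themselves, which is one possible reading of sigma_u = 0 and the between R-sq of 0.9973 in the augmented model. The sketch builds the powers from xb instead. The names xvars, xbhat, xbhat_2, xbhat_3 and xbhat_4 are illustrative and do not appear in the original post, and the data are assumed to be xtset already.

    ```stata
    * Assumes the panel has already been declared, e.g. xtset districtid year
    local xvars shareofirr_final Share_urbanpop shareofnonagriareainga ///
        shareofscandst averagelandsize_ha sharemarginal ///
        numberofbanksper1000sqkm gddppercapitaRS populationdensitypersqkm ///
        rainfall meantemperature

    * Baseline RE model with cluster-robust standard errors
    xtreg SI_Final `xvars' i.year, re vce(cluster districtid)

    * Linear prediction only (x*b); the xbu option would add the estimated u_i
    predict double xbhat, xb
    generate double xbhat_2 = xbhat^2
    generate double xbhat_3 = xbhat^3
    generate double xbhat_4 = xbhat^4

    * Augmented model with the powers of the linear prediction
    xtreg SI_Final `xvars' xbhat_2 xbhat_3 xbhat_4 i.year, re vce(cluster districtid)

    * Joint Wald test; H0: all three coefficients are zero
    test xbhat_2 xbhat_3 xbhat_4
    ```

    If the joint test still rejects with the xb-based powers, that would point more directly at the functional form of the regressors rather than at the district effects.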

    Q1: As the p-value is significant, I have to reject the H0 of no omitted variables. Is that the correct interpretation?

    Q2: Are misspecification and omitted variable bias the same thing? Are there any other tests I should check, such as linktest?

    Q3: ovtest does not work after xtreg. Should I instead use regress with i.panelid on the right-hand side of the model?


    I would appreciate it if someone could kindly help me with the above queries.


    Thank you

    Radhika C


  • #2
    Radhika:
    the main issue with your model seems to be that it has too many parameters for only 108 observations. A more parsimonious specification is worth considering.
    As far as your questions are concerned:
    1) Correct. Your model seems to be misspecified.
    2) Yes, they basically mean the same thing, and you actually performed -linktest- by hand (-linktest- stops at sq_fitted, though).
    3) No, as per 2).
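
    To illustrate point 2), a link-test-style check done by hand might look like the do-file sketch below. It stops at the squared prediction, as -linktest- does. The variable names hat and hatsq and the local macro xvars are illustrative assumptions, the linear prediction (xb) is used as the fitted value, and the data are assumed to be xtset already.

    ```stata
    * Assumes the panel has already been declared, e.g. xtset districtid year
    local xvars shareofirr_final Share_urbanpop shareofnonagriareainga ///
        shareofscandst averagelandsize_ha sharemarginal ///
        numberofbanksper1000sqkm gddppercapitaRS populationdensitypersqkm ///
        rainfall meantemperature

    * Original RE specification
    xtreg SI_Final `xvars' i.year, re vce(cluster districtid)

    * Fitted index and its square
    predict double hat, xb
    generate double hatsq = hat^2

    * Regress the outcome on the index and its square only
    xtreg SI_Final hat hatsq, re vce(cluster districtid)

    * A significant hatsq suggests misspecification
    test hatsq
    ```

    Under a well-specified model, hat should be significant and hatsq should not.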
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Dear Carlo, Thank you so much for your kind reply.

      I have one more question here.

      How do we correct the omitted variable bias or misspecification of the model?

      Is it ok to reduce the number of independent variables (only include the most important ones) in the model and see?

      Thank you

      Radhika




      • #4
        Radhika:
        yes, it is "ok to reduce the number of independent variables (only include the most important ones) in the model and see", as you stated.
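
        A rough do-file sketch of what "reduce and see" could look like. The subset stored in the local macro core is purely hypothetical: which regressors count as the most important should come from subject knowledge, not from this example. The names core, xbred, xbred_2 and xbred_3 are illustrative, and the data are assumed to be xtset already.

        ```stata
        * Assumes the panel has already been declared, e.g. xtset districtid year
        * Hypothetical reduced set of regressors; replace with a theory-driven choice
        local core shareofirr_final gddppercapitaRS populationdensitypersqkm

        * More parsimonious RE model
        xtreg SI_Final `core' i.year, re vce(cluster districtid)

        * Re-run the specification check on the reduced model
        predict double xbred, xb
        generate double xbred_2 = xbred^2
        generate double xbred_3 = xbred^3
        xtreg SI_Final `core' xbred_2 xbred_3 i.year, re vce(cluster districtid)
        test xbred_2 xbred_3
        ```

        Comparing this test with the one from the full model shows whether trimming the specification changes the evidence of misspecification.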
        Kind regards,
        Carlo
        (Stata 19.0)
