Testing for misspecification in panel regression models

Radhika Channanamchery

#1

04 Aug 2023, 01:36

Hi,

I would like to test for misspecification (using ovtest or linktest) after estimating an RE model.

My Stata code is given below. (I started with xtreg, fe and then ran xttest3, which suggested the presence of heteroskedasticity. I then used xtoverid, which indicated RE as the preferred model.)

xtreg SI_Final shareofirr_final Share_urbanpop shareofnonagriareainga shareofscandst averagelandsize_ha sharemarginal numberofbanksper1000sqkm gddppercapitaRS populationdensitypersqkm rainfall meantemperature i.year,re robust

Random-effects GLS regression                   Number of obs      =       108
Group variable: districtid                      Number of groups   =        27

R-sq:  within  = 0.3362                         Obs per group: min =         4
       between = 0.3131                                        avg =       4.0
       overall = 0.3151                                        max =         4

                                                Wald chi2(14)      =     73.29
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

                                        (Std. Err. adjusted for 27 clusters in districtid)
------------------------------------------------------------------------------------------
                         |             Robust
                SI_Final |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------------------+----------------------------------------------------------------
        shareofirr_final |   .0001311   .0000531     2.47   0.013     .0000271    .0002352
          Share_urbanpop |   -.000071    .000834    -0.09   0.932    -.0017056    .0015636
  shareofnonagriareainga |  -.0158636   .0075536    -2.10   0.036    -.0306684   -.0010589
          shareofscandst |   .0023731   .0018152     1.31   0.191    -.0011845    .0059308
      averagelandsize_ha |  -.0024828   .0359169    -0.07   0.945    -.0728786     .067913
           sharemarginal |   .0010549   .0007419     1.42   0.155    -.0003991    .0025089
numberofbanksper1000sqkm |   .0006007   .0003625     1.66   0.097    -.0001097    .0013111
         gddppercapitaRS |  -4.19e-07   1.07e-07    -3.91   0.000    -6.29e-07   -2.09e-07
populationdensitypersqkm |   .0001655   .0000661     2.50   0.012      .000036     .000295
                rainfall |  -9.77e-06   9.13e-06    -1.07   0.285    -.0000277    8.13e-06
         meantemperature |  -.0032167   .0050669    -0.63   0.526    -.0131477    .0067143
                         |
                    year |
                    2006 |   .0044992   .0131529     0.34   0.732    -.0212801    .0302784
                    2011 |   .0391637    .019784     1.98   0.048     .0003877    .0779397
                    2016 |   .0402596   .0270659     1.49   0.137    -.0127887    .0933078
                         |
                   _cons |   .7324753   .1512481     4.84   0.000     .4360344    1.028916
-------------------------+----------------------------------------------------------------
                 sigma_u |  .07243071
                 sigma_e |  .03383597
                     rho |  .82086397   (fraction of variance due to u_i)
------------------------------------------------------------------------------------------

Then I tested for omitted variable bias in the RE model by hand.

predict fit, xbu
gen fitt_2=fit^2
gen fitt_3=fit^3
gen fitt_4=fit^4

After that I re-ran the RE model with these terms included as regressors:

xtreg SI_Final shareofirr_final Share_urbanpop shareofnonagriareainga shareofscandst averagelandsize_ha sharemarginal numberofbanksper1000sqkm gddppercapitaRS populationdensitypersqkm rainfall meantemperature fitt_2 fitt_3 fitt_4 i.year,re robust

Random-effects GLS regression                   Number of obs      =       108
Group variable: districtid                      Number of groups   =        27

R-sq:  within  = 0.3899                         Obs per group: min =         4
       between = 0.9973                                        avg =       4.0
       overall = 0.9220                                        max =         4

                                                Wald chi2(17)      =  36580.52
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

                                        (Std. Err. adjusted for 27 clusters in districtid)
------------------------------------------------------------------------------------------
                         |             Robust
                SI_Final |      Coef.  Std. Err.        z   P>|z|     [95% Conf. Interval]
-------------------------+----------------------------------------------------------------
        shareofirr_final |   .0000161   .0000492     0.33   0.744    -.0000804    .0001126
          Share_urbanpop |  -.0000547   .0001708    -0.32   0.749    -.0003895    .0002801
  shareofnonagriareainga |   .0009251   .0008492     1.09   0.276    -.0007394    .0025895
          shareofscandst |   .0000777   .0002727     0.29   0.776    -.0004567    .0006122
      averagelandsize_ha |  -.0071001   .0110872    -0.64   0.522    -.0288305    .0146304
           sharemarginal |  -.0003738   .0005034    -0.74   0.458    -.0013603    .0006128
numberofbanksper1000sqkm |  -.0000579   .0001949    -0.30   0.766      -.00044    .0003242
         gddppercapitaRS |   9.55e-08   7.09e-08     1.35   0.178    -4.34e-08    2.34e-07
populationdensitypersqkm |  -.0000119   8.11e-06    -1.47   0.142    -.0000278    3.99e-06
                rainfall |   3.21e-07   3.44e-06     0.09   0.926    -6.41e-06    7.06e-06
         meantemperature |  -.0001208   .0008875    -0.14   0.892    -.0018602    .0016186
                  fitt_2 |   11.67344   6.944275     1.68   0.093    -1.937085    25.28397
                  fitt_3 |  -22.06439   14.91381    -1.48   0.139    -51.29491    7.166136
                  fitt_4 |    12.4602   8.881179     1.40   0.161    -4.946595    29.86699
                         |
                    year |
                    2006 |  -.0013051   .0111626    -0.12   0.907    -.0231833    .0205732
                    2011 |   -.009395   .0147289    -0.64   0.524    -.0382633    .0194732
                    2016 |    -.01071   .0174531    -0.61   0.539    -.0449175    .0234976
                         |
                   _cons |  -.4272979   .4024529    -1.06   0.288    -1.216091    .3614953
-------------------------+----------------------------------------------------------------
                 sigma_u |          0
                 sigma_e |  .03162606
                     rho |          0   (fraction of variance due to u_i)
------------------------------------------------------------------------------------------

test fitt_2 fitt_3 fitt_4

 ( 1)  fitt_2 = 0
 ( 2)  fitt_3 = 0
 ( 3)  fitt_4 = 0

           chi2(  3) =  4892.64
         Prob > chi2 =    0.0000

Q1: As the p-value is significant, I have to reject the H0 of no omitted variable bias. Is that the correct interpretation?

Q2: Are misspecification and omitted variable bias the same thing? Are there any other tests I should check, such as linktest?

Q3: ovtest does not work after xtreg. Should I instead use regress with i.panelid on the RHS of the model?

I would appreciate it if someone could kindly help me with the above queries.

Thank you

Radhika C
Carlo Lazzaro

#2

04 Aug 2023, 02:07

Radhika:
the main issue with your model seems to be that it has too many parameters for only 108 observations. Being more parsimonious is something to consider.
As far as your Qs are concerned:
1) Correct. Your model seems to be misspecified.
2) Yes, they basically mean the same thing, and you actually performed -linktest- by hand (-linktest- stops at sq_fitted, though).
3) No, as per 2).

Kind regards,
Carlo
(Stata 19.0)
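
The by-hand check discussed in this thread can be sketched roughly as follows. This is only a sketch under assumptions that are not in the original posts: it runs on Stata's nlswork example dataset (fetched with webuse, so it needs internet access) with illustrative regressors, and it builds the powers from predict with the xb option rather than xbu. After xtreg, xbu adds the predicted panel effect u_i to the linear prediction, so powers of it may partly proxy for the panel effect itself, which is one possible reading of sigma_u dropping to 0 in the augmented output above. Using xb here is a choice made for this sketch, not something the posters confirmed.

```stata
* Sketch only: nlswork is a Stata example dataset, not the poster's data
webuse nlswork, clear
xtset idcode year

* Baseline RE model with cluster-robust standard errors
xtreg ln_wage age ttl_exp tenure, re vce(cluster idcode)

* Linear prediction without the panel effect u_i
predict double xbhat, xb
generate double xbhat2 = xbhat^2
generate double xbhat3 = xbhat^3

* Augmented model: original regressors plus powers of the prediction
xtreg ln_wage age ttl_exp tenure xbhat2 xbhat3, re vce(cluster idcode)

* Joint Wald test; H0: the added powers have zero coefficients
test xbhat2 xbhat3
```

A small p-value from the final test would point towards a misspecified functional form, in the same spirit as the test on fitt_2, fitt_3 and fitt_4 above.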
Radhika Channanamchery

#3

04 Aug 2023, 03:13

Dear Carlo, Thank you so much for your kind reply.

I have one more question here.

How do we correct the omitted variable bias or misspecification of the model?

Is it ok to reduce the number of independent variables (only include the most important ones) in the model and see?

Thank you

Radhika
Carlo Lazzaro

#4

04 Aug 2023, 03:56

Radhika:
yes, it is "ok to reduce the number of independent variables (only include the most important ones) in the model and see", as you stated.

Kind regards,
Carlo
(Stata 19.0)
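
One way to act on this suggestion is to refit a smaller specification and repeat the check. The sketch below is again hypothetical: it uses the nlswork example dataset and an arbitrary reduced regressor list, and it mimics by hand what -linktest- does (refit the model on the prediction and its square only), since the thread does not establish whether -linktest- runs directly after -xtreg-. The names hat and hatsq are illustrative.

```stata
* Sketch only: example data and a hypothetical reduced specification
webuse nlswork, clear
xtset idcode year

* More parsimonious RE model
xtreg ln_wage age tenure, re vce(cluster idcode)

* Linear prediction (without u_i) and its square
predict double hat, xb
generate double hatsq = hat^2

* linktest-style refit: prediction and squared prediction as the only regressors
xtreg ln_wage hat hatsq, re vce(cluster idcode)

* H0: the squared prediction adds nothing
test hatsq
```

If hatsq is not significant in the reduced model, the check no longer flags misspecification. Which regressors to keep should still be decided on substantive grounds rather than by this test alone.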