Endogeneity in Panel Data Analysis

Alberto Gualtieri

Join Date: Feb 2016

Posts: 5
#1

14 Aug 2017, 00:12

Dear All,

I am working on an impact evaluation using Stata 13.1 (latest update) on OS X (latest update). My dataset is an unbalanced panel with two time periods: a baseline (2002), when nobody had applied to the program, and a follow-up (2010), when some had applied and others had not.

Code:

. tab MGNREGA adj_round

   MGNREGA |    Adjusted Round
 Treatment | 2002 - Ba  2010 - Fo |     Total
-----------+----------------------+----------
        No |     1,008        399 |     1,407
       Yes |         0        576 |       576
-----------+----------------------+----------
     Total |     1,008        975 |     1,983

The program was designed to be universal, so anybody can apply. Participation is therefore self-selected, which makes impact estimation tricky because unobservable characteristics may drive both take-up and outcomes.
The assessment will be in terms of multidimensional poverty. My dependent variable is my poverty estimate with the cut-off at K=4 (hence m0_4).

Before posting here I read as much of the literature as possible, and I have followed Prof. Wooldridge's guide here: http://www.ifs.org.uk/docs/wooldridge%20session%204.pdf

However, when I try to run the -xtivreg2- command:

Code:

xtivreg2 m0_4 typesite region careage caredu (MGNREGA = mch*), fe endog(MGNREGA)

Stata returns the following error: "no observations".

If I try to run a simple -xtivreg- command:

Code:

xtivreg m0_4 typesite region careage caredu (MGNREGA = mch*), fe

Stata returns the following error:

Code:

the sample specifies cross-sectional data,
xtivreg is not designed for cross-sectional data,
use ivregress with cross-sectional data
r(498);

I did tsset my panel:

Code:

. tsset
       panel variable:  childid (unbalanced)
        time variable:  adj_round, 0 to 1
                delta:  1 unit

Am I doing something wrong?

Thank you very much.

Alberto
Joseph Coveney

Join Date: Apr 2014

Posts: 4449
#2

14 Aug 2017, 00:53

I don't know anything about the user-written command, but does MGNREGA happen to be a string variable? In my experience, feeding a string variable to a command that's expecting a numeric is a frequent reason for the "No observations" error message. Another is if one variable (or a combination of predictor variables) is all-missing, and so you might want to check about that, too.
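Both possibilities can be checked in a few lines. The following is a minimal sketch using the variable names from the first post; nmiss is a made-up helper name, and the variable list should be adjusted to whatever is actually in the model.

```stata
* Storage types: str# in the output would indicate a string variable
describe m0_4 MGNREGA typesite region careage caredu mch*

* Exits with an error if MGNREGA is not numeric
confirm numeric variable MGNREGA

* Missing-value counts per variable (an all-missing variable shows up here)
misstable summarize m0_4 MGNREGA typesite region careage caredu mch*

* Rows with complete data on every variable in the model
* (nmiss is a hypothetical helper name, not from the original post)
egen nmiss = rowmiss(m0_4 MGNREGA typesite region careage caredu mch*)
count if nmiss == 0
```

If the final count is zero, no row has complete data on all variables at once, which would be consistent with a "no observations" message even when no single variable is entirely missing.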
Comment
Alberto Gualtieri

Join Date: Feb 2016

Posts: 5
#3

14 Aug 2017, 12:21

Hi Joseph, thank you for your reply.

MGNREGA is not a string variable but a dummy that takes the value of 0 in the case of "No treatment" and 1 in the case of "Treatment". It doesn't have any missing values.
Comment
adeeba sarwar

Join Date: Aug 2017

Posts: 4
#4

14 Aug 2017, 12:30

I have three equations in my model. I want to apply panel 3SLS, but first I have to check for simultaneity bias. How can I check for simultaneity bias in these equations? Is there a test available for simultaneity?
Comment
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2204
#5

14 Aug 2017, 14:08

Alberto: What are the mch* variables you are using as instruments? You might want to list some of the variables, especially the childid and adj_round variables, to make sure you really have this set up as a panel.
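A sketch of what such a check might look like, assuming the variable names used earlier in the thread; nmiss_row, complete and nrounds are hypothetical helper names introduced only for illustration.

```stata
* Confirm that childid and adj_round jointly identify each row
isid childid adj_round

* Look at the first few children, one block per child
sort childid adj_round
list childid adj_round MGNREGA m0_4 in 1/20, sepby(childid)

* Declare the panel and summarize the participation patterns
xtset childid adj_round
xtdescribe

* How many complete rows does each child contribute to the model?
* (nmiss_row, complete and nrounds are hypothetical helper names)
egen nmiss_row = rowmiss(m0_4 MGNREGA typesite region careage caredu mch*)
gen byte complete = nmiss_row == 0
bysort childid: egen nrounds = total(complete)
tabulate nrounds
```

If no child has two complete rows, the usable sample has at most one observation per child, which may be why -xtivreg- reports the data as cross-sectional even though the panel has been declared.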
Comment
Alberto Gualtieri

Join Date: Feb 2016

Posts: 5
#6

14 Aug 2017, 23:29

Adeeba: I am sorry I don't have an answer to your question. I am sure somebody else will be able to give you a proper answer.

Jeff: the mch* variables I am using as instruments are a set of variables I created following a methodology I found in the Handbook on Impact Evaluation (Khandker, Koolwal & Samad, 2010). In my scenario, program placement in the area may be endogenous but household eligibility is not. Therefore, a combination of these factors should be exogenous.

To apply, an individual needs to be an adult living in a rural area. I used the following code to create the set of mch* variables:

Code:

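* Household-level participation: 1 if the child is ever recorded as treated
egen hhMGNREGA = max(MGNREGA), by(childid)
* Household choice: participating, typesite == 2, head aged 18 or older
* (caution: in Stata a missing headage also satisfies headage >= 18)
gen hhchoice = hhMGNREGA == 1 & typesite == 2 & headage >= 18
* Then I interact hhchoice with a number of variables
foreach v of varlist hhsize headage headsex headedu careage caresex caredu dadage momage dadedu momedu dadlive momlive dadlit momlit sccorp stcorp bccorp mincorp {
    gen mch`v' = hhchoice * `v'
}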

I will add childid and adj_round in the regression command and try again. Thank you very much!
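One further check that may be worth running, since -fe- uses only variation within each child: hhMGNREGA is constant within childid by construction, so hhchoice and the mch* interactions may change little or not at all between the two rounds. The sketch below assumes the variables created above exist; d_hhchoice is a hypothetical helper name.

```stata
xtset childid adj_round

* Within-child standard deviation of hhchoice and the instruments;
* a "within" Std. Dev. of zero means the fixed-effects transformation
* removes that variable entirely
xtsum hhchoice mch*

* Change in hhchoice between rounds, for children observed in both
* (d_hhchoice is a hypothetical helper name)
bysort childid (adj_round): gen d_hhchoice = hhchoice - hhchoice[_n-1]
tabulate d_hhchoice, missing
```

An instrument with no within-child variation cannot contribute to a fixed-effects first stage, so this output may help identify which mch* variables are usable.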

Alberto Gualtieri

Join Date: Feb 2016
Posts: 5

15 Aug 2017, 07:24

Dear All,

I was able to find and remove the instrument that was causing the "no observations" error.

I'd like to ask whether my interpretation of the -xtivreg2- output is correct:

Code:

xtivreg2 m0_4 typesite region headage headedu careage caredu (MGNREGA = mchhhsize mchheadage mchheadsex mchheadedu mchcareage mchcaresex mchcaredu), fe gmm2s cluster(childid) endog(MGNREGA)

Code:

Underidentification test (Kleibergen-Paap rk LM statistic):            384.579
                                                   Chi-sq(7) P-val =    0.0000
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):              175.814
                         (Kleibergen-Paap rk Wald F statistic):        109.790
Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    19.86
                                         10% maximal IV relative bias    11.29
                                         20% maximal IV relative bias     6.73
                                         30% maximal IV relative bias     5.07
                                         10% maximal IV size             31.50
                                         15% maximal IV size             17.38
                                         20% maximal IV size             12.48
                                         25% maximal IV size              9.93
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
------------------------------------------------------------------------------
Hansen J statistic (overidentification test of all instruments):         2.399
                                                   Chi-sq(6) P-val =    0.8796
-endog- option:
Endogeneity test of endogenous regressors:                               7.705
                                                   Chi-sq(1) P-val =    0.0055
Regressors tested:    MGNREGA

According to the "Underidentification Test", I can reject the null hypothesis that my model is underidentified (hence, it is identified);
According to the "Overidentification Test" (Hansen J), I cannot reject the null that my overidentifying restrictions are valid.

Which in turn means that my IV estimation cannot be supported. Am I right?
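
For reference, the direction of each null hypothesis matters when reading this output. The null of the Hansen J test is that the instruments are jointly valid (uncorrelated with the error term), so a p-value of 0.8796 means the overidentifying restrictions are not rejected; on its own, that does not count against the IV estimates. The null of the endogeneity test is that MGNREGA can be treated as exogenous, and a p-value of 0.0055 rejects that. The degrees of freedom and p-values can be checked by hand with Stata's chi2tail() function. This is only an arithmetic check on the statistics printed above, not a rerun of the model:

```stata
* Hansen J: 7 excluded instruments - 1 endogenous regressor = 6 df
display chi2tail(6, 2.399)

* Endogeneity test of MGNREGA: 1 regressor tested = 1 df
display chi2tail(1, 7.705)
```

The two displayed values should agree with the "P-val" lines reported for the Hansen J statistic and the endogeneity test.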


Umair Ali

Join Date: Mar 2018
Posts: 17

19 Feb 2019, 13:47

Originally posted by Alberto Gualtieri:

According to the "Underidentification Test", I can reject the null hypothesis that my model is underidentified (hence, it is identified);
According to the "Overidentification Test" (Hansen J), I cannot reject the null that my overidentifying restrictions are valid.

Which in turn means that my IV estimation cannot be supported. Am I right?

So why do you want over-identification? If your model is adequately identified, should it not be enough that it is not under-identified? What happened with this in the end? I am getting the same cross-sectional data message from xtivreg right now, and I would like to know how yours was resolved.
