  • Endogeneity in Panel Data Analysis

    Dear All,

    I am working on an impact evaluation using Stata 13.1 (latest update) on OS X (latest update). My dataset is an unbalanced panel with two time periods: baseline (2002), when nobody had applied to the program, and follow-up (2010), when some had applied and others had not.

    Code:
    
    . tab MGNREGA adj_round
    
       MGNREGA |    Adjusted Round
     Treatment | 2002 - Ba  2010 - Fo |     Total
    -----------+----------------------+----------
            No |     1,008        399 |     1,407 
           Yes |         0        576 |       576 
    -----------+----------------------+----------
         Total |     1,008        975 |     1,983
    The program was designed to be universal, hence everybody can apply, which makes impact estimation tricky due to unobservable characteristics.
    The assessment will be in terms of multidimensional poverty. My dependent variable will be my poverty estimation with cut-off at K=4 (hence m0_4).

    Before posting here I read as much of the literature as possible, and I have followed Prof. Wooldridge's guide here: http://www.ifs.org.uk/docs/wooldridge%20session%204.pdf

    However, when I try to run the -xtivreg2- command:

    Code:
    xtivreg2 m0_4 typesite region careage caredu (MGNREGA = mch*), fe endog(MGNREGA)
    Stata returns the following error: "no observation".

    If I try to run a simple -xtivreg- command:

    Code:
     xtivreg m0_4 typesite region careage caredu (MGNREGA = mch*), fe
    Stata returns the following error:

    Code:
    the sample specifies cross-sectional data, xtivreg is not designed for cross-sectional data, use ivregress with cross-sectional data, r(498);
    I did tsset my panel:

    Code:
    . tsset
           panel variable:  childid (unbalanced)
            time variable:  adj_round, 0 to 1
                    delta:  1 unit
    Am I doing something wrong?

    Thank you very much.

    Alberto

  • #2
    I don't know anything about the user-written command, but does MGNREGA happen to be a string variable? In my experience, feeding a string variable to a command that's expecting a numeric is a frequent reason for the "No observations" error message. Another is if one variable (or a combination of predictor variables) is all-missing, and so you might want to check about that, too.
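
    Both checks can be sketched in a few lines. This is only a minimal sketch, assuming the variable names from the first post; nmiss and nrounds are hypothetical helper names introduced for illustration.

    ```stata
    * Storage types: a str# type would mean the variable is a string
    describe m0_4 typesite region careage caredu MGNREGA mch*

    * Number of missing values per observation across all model variables
    egen nmiss = rowmiss(m0_4 typesite region careage caredu MGNREGA mch*)

    * Number of complete rounds per child, repeated on each of the child's rows
    bysort childid: egen nrounds = total(nmiss == 0)
    tabulate nrounds
    ```

    If nmiss is positive on every row, some variable or combination of variables leaves no complete cases, which would fit the "no observations" message. If no child has two complete rounds, that would be consistent with -xtivreg- treating the estimation sample as cross-sectional.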


    • #3
      Hi Joseph, thank you for your reply.

      MGNREGA is not a string variable but a dummy that takes the value of 0 in the case of "No treatment" and 1 in the case of "Treatment". It doesn't have any missing values.


      • #4
        I have three equations in my model. I want to apply panel 3SLS, but first I have to check for simultaneity bias. How can I check for simultaneity bias in these equations? Is there a test available for simultaneity?


        • #5
          Alberto: What are the mch* variables you are using as instruments? You might want to list some of the variables, especially the childid and adj_round variables, to make sure you really have this set up as a panel.
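
          One way to run that check is sketched below. It assumes the data have already been declared as a panel with -tsset- or -xtset-, as shown in the first post, and uses only the variable names given there.

          ```stata
          sort childid adj_round

          * Stops with an error if childid and adj_round do not uniquely identify rows
          isid childid adj_round

          * Pattern of rounds observed for each child
          xtdescribe

          * Inspect the first few children by eye
          list childid adj_round MGNREGA in 1/20, sepby(childid)
          ```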


          • #6
            Adeeba: I am sorry I don't have an answer to your question. I am sure somebody else will be able to give you a proper answer.

            Jeff: the mch* variables I am using as instruments are a set of variables I created following a methodology I found in the book Handbook on Impact Evaluation (Khandker, Koolwal & Samad, 2010). In my scenario, program placement in the area may be endogenous, but household eligibility is not. Therefore, a combination of these factors should be exogenous.

            To apply, an individual needs to be an adult living in a rural area. I used the following code to create the set of mch* variables:

            Code:
            egen hhMGNREGA=max(MGNREGA), by(childid)
            gen hhchoice = hhMGNREGA == 1 & typesite == 2 & headage >= 18
            
            * Then I interact hhchoice with a number of variables
            
            foreach v of varlist hhsize headage headsex headedu careage caresex caredu dadage momage dadedu momedu dadlive momlive dadlit momlit sccorp stcorp bccorp mincorp {
                gen mch`v' = hhchoice * `v'
            }
            I will add childid and adj_round in the regression command and try again. Thank you very much!
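
            Because the fe option works with within-child deviations, an instrument that never changes between 2002 and 2010 for any child carries no usable variation, and an instrument recorded in only one round leaves no child with two complete rounds. A minimal sketch for checking both, assuming the data are already xtset on childid and adj_round and the mch* variables exist as created above:

            ```stata
            * Overall, between, and within standard deviations for each instrument
            xtsum mch*

            * Non-missing counts of each instrument, by round
            tabstat mch*, by(adj_round) statistics(n)
            ```

            A within standard deviation of zero, or a count of zero in one round, would mark an instrument worth inspecting before it goes into the model.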


            • #7
              Dear All,

              I was able to remove the instrument creating the "no observation" error.

              I'd like to ask you if my interpretation of the -xtivreg2- output is correct:

              Code:
              xtivreg2 m0_4 typesite region headage headedu careage caredu (MGNREGA = mchhhsize mchheadage mchheadsex mchheadedu mchcareage mchcaresex mchcaredu), fe gmm2s cluster(childid) endog(MGNREGA)
              Code:
              Underidentification test (Kleibergen-Paap rk LM statistic):            384.579
                                                                 Chi-sq(7) P-val =    0.0000
              ------------------------------------------------------------------------------
              Weak identification test (Cragg-Donald Wald F statistic):              175.814
                                       (Kleibergen-Paap rk Wald F statistic):        109.790
              Stock-Yogo weak ID test critical values:  5% maximal IV relative bias    19.86
                                                       10% maximal IV relative bias    11.29
                                                       20% maximal IV relative bias     6.73
                                                       30% maximal IV relative bias     5.07
                                                       10% maximal IV size             31.50
                                                       15% maximal IV size             17.38
                                                       20% maximal IV size             12.48
                                                       25% maximal IV size              9.93
              Source: Stock-Yogo (2005).  Reproduced by permission.
              NB: Critical values are for Cragg-Donald F statistic and i.i.d. errors.
              ------------------------------------------------------------------------------
              Hansen J statistic (overidentification test of all instruments):         2.399
                                                                 Chi-sq(6) P-val =    0.8796
              -endog- option:
              Endogeneity test of endogenous regressors:                               7.705
                                                                 Chi-sq(1) P-val =    0.0055
              Regressors tested:    MGNREGA
              • According to the "Underidentification Test", I can reject the hypothesis that my model is underidentified (hence, it is identified);
              • According to the "Overidentification Test", I cannot reject the null that my overidentifying restrictions are valid.
              Which in turn means that my IV estimation cannot be supported. Am I right?
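
              As a rough guide to reading these statistics, the sketch below recomputes the p-values from the values printed in the output, with the null hypothesis of each test noted in a comment. It assumes only the statistics and degrees of freedom reported above.

              ```stata
              * Kleibergen-Paap rk LM test. H0: the equation is underidentified
              display chi2tail(7, 384.579)

              * Hansen J test. H0: the overidentifying restrictions are valid
              display chi2tail(6, 2.399)

              * Endogeneity test. H0: MGNREGA can be treated as exogenous
              display chi2tail(1, 7.705)
              ```

              On this reading, the reported J p-value of 0.8796 gives no evidence against the instruments, and the endogeneity test p-value of 0.0055 rejects treating MGNREGA as exogenous. Taken together, that would tend to favor the IV estimates rather than undermine them.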


              • #8
                Originally posted by Alberto Gualtieri (post #7 above)

                So why do you want overidentification? If your model is adequately identified, shouldn't it be enough that it is not underidentified? What happened with this in the end? I am getting the same cross-section error message with xtivreg right now, and I would like to see how yours turned out.
