  • method for omitted variable bias cross sectional analysis

    Dear all,
    I have a cross-sectional analysis of the effect of head circumference (a continuous variable) on cognitive skills (a continuous variable). Can someone please suggest a method I can use to address omitted variable bias? I would be really grateful.

  • #2
    If I understand right, you may wish to take a look at the Ramsey RESET test. Just type -help estat ovtest- and check it out.
    Best regards,

    Marcos


    • #3
      Thanks for your reply. I am trying to address endogeneity, and I would have used instrumental variable analysis, but I do not have an instrument. That is why I am asking for another method that can address endogeneity (omitted variable bias) when both the dependent and the independent variable are continuous.


      • #4
        Adeola: Unfortunately, you're asking for the impossible. In fact, your setup is very unusual for even thinking of IV estimation. Usually, IV estimation arises when the key explanatory variable can be influenced by economic agents. For example, people choose whether to participate in a job training program. Or they choose their level of education. A principal decides how many students to put in a class. And so on.

        I guess that's irrelevant since a valid IV is difficult to imagine. What other control variables do you have? Any?

        BTW, I've said this a few times before and I'll say it again: Despite the fact that one obtains RESET using -estat ovtest-, RESET is not a valid test for omitted variables. As I showed in some work many years ago, one will pass the test if the OV is linearly related to the included variable. Failing the test means you might have a functional form problem, which is easily handled by putting in squares and so on. RESET is a good functional form test, but that's it.
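
        The functional form point can be illustrated with a minimal simulated sketch. Everything in it is an assumption made for illustration: the variable names, the sample size, the seed, and the quadratic data-generating process are not from this thread. The first fit leaves out the squared term, so RESET would be expected to reject; the second adds the square, after which the test should usually no longer reject.

        ```stata
        // illustrative simulation only: names, seed, and coefficients are made up
        clear
        set seed 42
        set obs 10000

        // the true relationship is quadratic in x
        generate x = rnormal()
        generate y = x + 0.5*x^2 + rnormal()

        // misspecified: linear in x only, so RESET has something to detect
        regress y x
        estat ovtest

        // add the square via factor-variable notation (Stata 11 or later)
        regress y c.x##c.x
        estat ovtest
        ```

        Here the "omitted variable" is a nonlinear function of an included regressor, which is exactly the case RESET is designed to pick up.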


        • #5
          Originally posted by Jeff Wooldridge
          RESET is not a valid test for omitted variables.
          This cannot be stressed enough. I find it very unfortunate that Stata's output for the H0 of the test seems to suggest otherwise. Here is a simple example demonstrating just how useless RESET is as a test for "omitted variables":

          Code:
          // make the test reproducible
          version 11.2
          set seed 42
          
          // create toy data
          clear
          matrix C = 1, .8\ .8, 1
          corr2data x z , corr(C) n(10000)
          
          // create the real world: y = 1*x + 1*z + error
          generate y = x + z + rnormal()
          
          // get the unbiased estimates for x and z
          regress y x z
          estat ovtest
          
          // now omit z
          regress y x
          estat ovtest
          The above yields

          Code:
          ...
          . // get the unbiased estimates for x and z
          . regress y x z
          
                Source |       SS       df       MS              Number of obs =   10000
          -------------+------------------------------           F(  2,  9997) =17728.69
                 Model |   35903.563     2  17951.7815           Prob > F      =  0.0000
              Residual |  10122.7981  9997  1.01258358           R-squared     =  0.7801
          -------------+------------------------------           Adj R-squared =  0.7800
                 Total |   46026.361  9999  4.60309641           Root MSE      =  1.0063
          
          ------------------------------------------------------------------------------
                     y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                     x |   1.011109    .016772    60.29   0.000     .9782329    1.043986
                     z |   .9862927    .016772    58.81   0.000     .9534161    1.019169
                 _cons |   .0051587   .0100627     0.51   0.608    -.0145662    .0248837
          ------------------------------------------------------------------------------
          
          . estat ovtest
          
          Ramsey RESET test using powers of the fitted values of y
                 Ho:  model has no omitted variables
                          F(3, 9994) =      0.57
                            Prob > F =      0.6352
          
          .
          . // now omit z
          . regress y x
          
                Source |       SS       df       MS              Number of obs =   10000
          -------------+------------------------------           F(  1,  9998) =23777.47
                 Model |  32401.9294     1  32401.9294           Prob > F      =  0.0000
              Residual |  13624.4316  9998   1.3627157           R-squared     =  0.7040
          -------------+------------------------------           Adj R-squared =  0.7040
                 Total |   46026.361  9999  4.60309641           Root MSE      =  1.1674
          
          ------------------------------------------------------------------------------
                     y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                     x |   1.800144   .0116741   154.20   0.000      1.77726    1.823027
                 _cons |   .0051587   .0116735     0.44   0.659    -.0177238    .0280412
          ------------------------------------------------------------------------------
          
          . estat ovtest
          
          Ramsey RESET test using powers of the fitted values of y
                 Ho:  model has no omitted variables
                          F(3, 9995) =      0.31
                            Prob > F =      0.8169
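
          The coefficient of about 1.80 on x in the second regression is what the standard omitted-variable-bias algebra gives. As a sketch (the symbols \tilde{\beta}_x for the short-regression slope and \hat{\delta}_1 for the slope from regressing z on x are introduced here for illustration): corr2data creates x and z with unit variance by default and correlation .8, so \hat{\delta}_1 = 0.8, and plugging in the estimates from the first regression gives

          ```latex
          \begin{aligned}
          \tilde{\beta}_x &= \hat{\beta}_x + \hat{\beta}_z \,\hat{\delta}_1 \\
                          &= 1.011109 + 0.9862927 \times 0.8 \\
                          &\approx 1.8001
          \end{aligned}
          ```

          Because z enters the short regression only through a term that is linear in x, the model y on x is still a correctly specified linear function of x. The powers of the fitted values that RESET adds have nothing to pick up, so the test passes even though the slope is far from the true value of 1.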
          Best
          Daniel
          Last edited by daniel klein; 29 Nov 2019, 11:49.


          • #6
            Adeola:
            as an aside to the previous excellent replies, I find it really hard to believe that your data generating process includes one predictor only.
            Kind regards,
            Carlo
            (Stata 19.0)


            • #7
              Thanks for your reply. My model is actually:
              Cogskill_i = β0 + β1 HC_i + β2 X_i + β3 Z_i + ε_i    (1)
              where Cogskill_i represents cognitive skills in childhood, HC_i represents head circumference, X_i represents the respondent's characteristics (sex, social class, birth weight, days read to), and Z_i represents the respondent's parental characteristics (mother's age, mother's smoking habit, mother's education, and father's education). However, I do not have data on parental behaviour, which could influence both HC and the cognitive skills of the child. This is why I wanted a method that can address endogeneity (omitted variable bias).


              • #8
                You're controlling for some of the mother's habits, so I'm not sure what other "behavior" you're thinking about that would affect HC. Maybe drinking during pregnancy? In any case, you might want to do a sensitivity analysis. You can essentially see how the estimate of β1 changes as you allow different amounts of correlation between two error terms. I have an example of this somewhere.
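
                One way such a sensitivity analysis could be set up is sketched below. It is not necessarily the example referred to above. It assumes that the controls are exogenous, that head circumference is linear in the same controls with error v, and that the outcome error u has correlation rho with v. Under those assumptions OLS estimates β1 + rho*sd(u)/sd(v), and the OLS residual variance equals var(u)*(1 - rho^2), so β1 can be backed out for any assumed rho. The data are simulated and all variable names are hypothetical.

                ```stata
                // illustrative simulation only: names, seed, and coefficients are made up
                clear
                set seed 12345
                set obs 5000
                generate x  = rnormal()            // stands in for the observed controls
                generate v  = rnormal()            // error in the head-circumference equation
                generate u  = 0.5*v + rnormal()    // outcome error, correlated with v by construction
                generate hc = 0.5*x + v
                generate y  = 1*hc + 1*x + u       // true coefficient on hc is 1

                // sd of the part of hc not explained by the controls
                regress hc x
                scalar sigma_v = e(rmse)

                // OLS estimate and residual sd from the outcome regression
                regress y hc x
                scalar b_ols   = _b[hc]
                scalar sigma_e = e(rmse)

                // implied coefficient: b_ols - rho/sqrt(1 - rho^2) * sigma_e/sigma_v
                foreach rho of numlist -0.5(0.1)0.5 {
                    display %6.2f `rho' "   " %9.4f (b_ols - (`rho'/sqrt(1 - (`rho')^2))*sigma_e/sigma_v)
                }
                ```

                In this simulation the correlation between u and v is about 0.45 by construction, so the implied coefficient should come out close to 1 for rho near that value, while rho = 0 simply returns the OLS estimate. With real data rho is unknown, and the point of the exercise is to see how large it would have to be to change the conclusion about β1.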
