method for omitted variable bias cross sectional analysis

Adeola Kolade

Join Date: Aug 2017

Posts: 76
#1

29 Nov 2019, 04:54

Dear all,
I have a cross-sectional analysis of the effect of head circumference (a continuous variable) on cognitive skills (also continuous). Can someone please suggest a method I can use to address omitted variable bias? I would be really grateful.
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#2

29 Nov 2019, 05:01

If I understand correctly, you may wish to take a look at the Ramsey RESET test. Just type -help estat ovtest- and check it out.

Best regards,

Marcos
Adeola Kolade

Join Date: Aug 2017

Posts: 76
#3

29 Nov 2019, 05:57

Thanks for your reply. I am trying to address endogeneity. I would have used instrumental variables, but I do not have an instrument, which is why I am asking for another method that can deal with endogeneity (omitted variable bias) when both the dependent and the independent variable are continuous.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2167
#4

29 Nov 2019, 10:33

Adeola: Unfortunately, you're asking for the impossible. In fact, your setup is very unusual for even thinking of IV estimation. Usually, IV estimation arises when the key explanatory variable can be influenced by economic agents. For example, people choose whether to participate in a job training program. Or they choose their level of education. A principal decides how many students to put in a class. And so on.

I guess that's irrelevant since a valid IV is difficult to imagine. What other control variables do you have? Any?

BTW, I've said this a few times before and I'll say it again: Despite the fact that one obtains RESET using -estat ovtest-, RESET is not a valid test for omitted variables. As I showed in some work many years ago, one will pass the test if the OV is linearly related to the included variables. Failing the test means you might have a functional form problem, which is easily handled by putting in squares and so on. RESET is a good functional form test, but that's it.
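
A short sketch may help show why RESET cannot detect this kind of omitted variable. The notation is introduced here purely for illustration and is not taken from the work mentioned above: z is the omitted variable, γ is its effect on y, and δ₀, δ₁ describe an assumed linear relation between z and the included variable x.

```latex
\begin{align}
y &= \beta_0 + \beta_1 x + \gamma z + u, & E[u \mid x, z] &= 0 \\
z &= \delta_0 + \delta_1 x + v,          & E[v \mid x] &= 0 \\
E[y \mid x] &= (\beta_0 + \gamma\delta_0) + (\beta_1 + \gamma\delta_1)\,x
\end{align}
```

Under these assumptions the conditional mean of y given x alone is still exactly linear in x. Powers of the fitted values therefore have nothing left to explain, and RESET has no power, even though the slope on x is β₁ + γδ₁ rather than β₁.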

daniel klein

Join Date: Mar 2014
Posts: 3850

29 Nov 2019, 11:46

Originally posted by Jeff Wooldridge

RESET is not a valid test for omitted variables.

This cannot be stressed enough. I find it very unfortunate that the Stata output for the H0 of the test seems to suggest otherwise. Here is a simple example demonstrating just how useless RESET is as a test for "omitted variables":

Code:

// make the test reproducible
version 11.2
set seed 42

// create toy data
clear
matrix C = (1, .8 \ .8, 1)
corr2data x z, corr(C) n(10000)

// create the real world: y = 1*x + 1*z + error
generate y = x + z + rnormal()

// get the unbiased estimates for x and z
regress y x z
estat ovtest

// now omit z
regress y x
estat ovtest

The above yields

Code:

...
. // get the unbiased estimates for x and z
. regress y x z

      Source |       SS       df       MS              Number of obs =   10000
-------------+------------------------------           F(  2,  9997) =17728.69
       Model |   35903.563     2  17951.7815           Prob > F      =  0.0000
    Residual |  10122.7981  9997  1.01258358           R-squared     =  0.7801
-------------+------------------------------           Adj R-squared =  0.7800
       Total |   46026.361  9999  4.60309641           Root MSE      =  1.0063

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   1.011109    .016772    60.29   0.000     .9782329    1.043986
           z |   .9862927    .016772    58.81   0.000     .9534161    1.019169
       _cons |   .0051587   .0100627     0.51   0.608    -.0145662    .0248837
------------------------------------------------------------------------------

. estat ovtest

Ramsey RESET test using powers of the fitted values of y
       Ho:  model has no omitted variables
                F(3, 9994) =      0.57
                  Prob > F =      0.6352

.
. // now omit z
. regress y x

      Source |       SS       df       MS              Number of obs =   10000
-------------+------------------------------           F(  1,  9998) =23777.47
       Model |  32401.9294     1  32401.9294           Prob > F      =  0.0000
    Residual |  13624.4316  9998   1.3627157           R-squared     =  0.7040
-------------+------------------------------           Adj R-squared =  0.7040
       Total |   46026.361  9999  4.60309641           Root MSE      =  1.1674

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   1.800144   .0116741   154.20   0.000      1.77726    1.823027
       _cons |   .0051587   .0116735     0.44   0.659    -.0177238    .0280412
------------------------------------------------------------------------------

. estat ovtest

Ramsey RESET test using powers of the fitted values of y
       Ho:  model has no omitted variables
                F(3, 9995) =      0.31
                  Prob > F =      0.8169
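
The size of the bias in the second regression can be checked by hand from the output above. As a sketch: corr2data produces data whose sample correlations match C exactly, with unit standard deviations by default, so the slope from regressing z on x is 0.8. The coefficient on x in the short regression then equals the coefficient on x in the long regression plus the coefficient on z times that slope. The symbols below are labels introduced here for illustration.

```latex
\begin{align}
\tilde{b}_x &= \hat{b}_x + \hat{b}_z \,\hat{\delta}_{zx} \\
            &= 1.011109 + 0.9862927 \times 0.8 \\
            &\approx 1.800143
\end{align}
```

This agrees with the reported 1.800144 up to rounding of the displayed coefficients, while the RESET p-value of 0.8169 gives no hint that anything is missing.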

Best
Daniel

Last edited by daniel klein; 29 Nov 2019, 11:49.


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17707
#6

29 Nov 2019, 13:04

Adeola:
as an aside to the previous excellent replies, I find it really hard to believe that your data generating process includes one predictor only.

Kind regards,
Carlo
(Stata 19.0)
Adeola Kolade

Join Date: Aug 2017

Posts: 76
#7

30 Nov 2019, 06:20

Thanks for your reply. My model is actually:
Cogskill_i = β₀ + β₁HC_i + β₂X_i + β₃Z_i + ε_i    (1)
where Cogskill_i represents cognitive skills in childhood, HC_i represents head circumference, X_i represents the respondent's characteristics (sex, social class, birth weight, days read to), and Z_i represents the respondent's parental characteristics (mother's age, mother's smoking habit, mother's education and father's education). However, I do not have data on parental behaviour that could influence both HC and the child's cognitive skills. This is why I wanted a method that can address endogeneity (omitted variable bias).
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2167
#8

30 Nov 2019, 12:21

You're controlling for some of the mother's habits, so I'm not sure what other "behavior" you're thinking about that would affect HC. Maybe drinking during pregnancy? In any case, you might want to do a sensitivity analysis. You can essentially see how the estimate of β₁ changes as you allow different amounts of correlation between two error terms. I have an example of this somewhere.
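
One way such a sensitivity analysis could be formalized is sketched here; it is an illustration under stated assumptions and not necessarily the example referred to above. Let W_i collect all controls (X_i and Z_i), let u_i be the error from a linear equation for HC_i given W_i, and let ρ be an assumed correlation between u_i and the error ε_i in equation (1). The sketch assumes W_i is uncorrelated with both errors and that the errors are homoskedastic. Apart from β₁, HC_i and ε_i, all symbols are introduced here for illustration.

```latex
\begin{align}
HC_i &= W_i\pi + u_i, \qquad \rho = \operatorname{Corr}(u_i, \varepsilon_i) \\
\operatorname{plim}\,\hat{\beta}_1^{OLS} &= \beta_1 + \rho\,\frac{\sigma_\varepsilon}{\sigma_u} \\
\sigma_r^2 &= \sigma_\varepsilon^2\,(1 - \rho^2) \\
\beta_1(\rho) &= \hat{\beta}_1^{OLS} - \frac{\rho}{\sqrt{1 - \rho^2}}\,\frac{\hat{\sigma}_r}{\hat{\sigma}_u}
\end{align}
```

Here σ̂_r is the residual standard deviation from the OLS fit of equation (1), and σ̂_u is the residual standard deviation from regressing HC_i on the controls. Setting ρ = 0 returns the OLS estimate; tabulating β₁(ρ) over a plausible range of ρ shows how strong the unobserved confounding would have to be to change the conclusion.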