Endogeneity due to measurement errors with limited dependent variable

Gaston Fernandez

Join Date: Jul 2015

Posts: 27
#1

01 Apr 2020, 14:20

Dear all,

For my research, I'm evaluating the effect of a proxy variable of abilities (i.e. a standard IQ test) on three limited dependent variables, namely: a dichotomous variable, a discrete variable that goes from 0 to 11, and a continuous variable that ranges from 0 to 1.

Of course, this IQ test might not be an error-free measure of abilities. That is why I would like to instrument it and see if I get something interesting.

However, since my left-hand-side variables are limited dependent variables, I am not sure whether there is a framework analogous to linear IV (2SLS) that would suit my setting.

I would be pleased if someone can give some advice on this.

Thanks!

Last edited by Gaston Fernandez; 01 Apr 2020, 14:28.
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

02 Apr 2020, 11:33

You will increase your chances of a useful answer by following the FAQ on asking questions: provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

You may find this helpful. http://www.stata.com/meeting/germany...ukker_gsem.pdf
Comment
Gaston Fernandez

Join Date: Jul 2015

Posts: 27
#3

03 Apr 2020, 06:22

Thanks for your reply, Phil. I will look at what you suggested.

Thanks for your advice as well.

So, as I mentioned above, I have three dependent variables: a dichotomous variable (named garp), a discrete variable that goes from 0 to 11 (named vgarp), and a variable that ranges from 0 to 1 (named afriat). See below a summary of these variables.

Code:

sum garp vgarp afriat

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
        garp |        206     .538835    .4997039          0          1
       vgarp |        206    1.946602    2.741206          0         11
      afriat |        206     .949095    .1174611     .33333          1

The goal is to analyze the effect (if any) of a proxy variable of cognitive abilities (i.e. a standard IQ test) on these three dependent variables. My independent variable, named crt, is defined as the total number of correct answers in the test. See below.

Code:

tab crt, m

  number of |
    correct |
    answers |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |         56       27.18       27.18
          1 |         40       19.42       46.60
          2 |         46       22.33       68.93
          3 |         64       31.07      100.00
------------+-----------------------------------
      Total |        206      100.00

Since all three dependent variables are limited, I am wondering whether there is a method suited to this setting that would let me assess the potential bias in the estimated effect of crt due to measurement error.

Any suggestions will help.

Thanks!
Comment
Hong Il Yoo

Join Date: Jan 2015

Posts: 292
#4

03 Apr 2020, 06:51

Hi Gaston, as a starting point, you may consider -eprobit- for garp; and -eoprobit-, -eintreg-, or -ivpoisson- for vgarp, depending on the informational content of that variable. The last variable, afriat, seems to be a fractional outcome, and I do not know whether there is a Stata command that allows one to estimate a fractional response model with endogenous regressors; if there is none, you may consider writing a simple program that applies -fracreg- with -bootstrap- and the control function approach described in Jeff Wooldridge's Econometric Analysis of Cross Section and Panel Data (2nd ed.).

P.S.: I just noticed that your endogenous regressor is better modelled as a count variable than as a continuous variable. That may complicate things a lot; I guess someone more familiar with this type of application can provide better suggestions.
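To make the suggestions above a bit more concrete, here is a minimal, untested sketch. It assumes a hypothetical instrument z1 (for example, a second ability measure) and a hypothetical exogenous control x1; neither variable appears in this thread, so replace them with your own. The second part is only one possible way to code the control function idea, and its first stage treats crt as continuous, which is an approximation given that crt takes the values 0 to 3.

```stata
* Hypothetical names: z1 = instrument for crt, x1 = exogenous control.
* Neither variable comes from the thread; replace them with your own.

* (1) Binary outcome garp: probit with an endogenous covariate.
*     The oprobit suboption treats crt (0-3) as ordinal rather than continuous.
eprobit garp x1, endogenous(crt = z1 x1, oprobit)

* (2) Fractional outcome afriat: control function with -fracreg-,
*     bootstrapping both stages together.
capture program drop cf_fracreg
program define cf_fracreg, eclass
    capture drop vhat
    regress crt z1 x1                    // first stage (treats crt as continuous)
    predict double vhat, residuals       // first-stage residuals
    fracreg probit afriat crt x1 vhat    // second stage with the residual added
end

bootstrap _b, reps(500) seed(12345): cf_fracreg
```

In the second part, the bootstrapped test on the coefficient of vhat can be read as a check on whether crt is endogenous in the afriat equation. Wrapping both stages in one program matters because the second-stage standard errors from -fracreg- alone ignore the fact that vhat is estimated.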

Last edited by Hong Il Yoo; 03 Apr 2020, 07:25.
Comment

Gaston Fernandez

Join Date: Jul 2015
Posts: 27
#5

03 Apr 2020, 07:25

Thanks for your answer, Hong. It is really helpful.

About your comment

-eoprobit-, -eintreg-, or -ivpoisson- for vgarp, depending on the informational content of that variable

please, look at the distribution of vgarp:

Code:

      vgarp |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        111       53.88       53.88
          1 |          5        2.43       56.31
          2 |         31       15.05       71.36
          3 |         11        5.34       76.70
          4 |          7        3.40       80.10
          5 |         18        8.74       88.83
          6 |          8        3.88       92.72
          7 |          3        1.46       94.17
          8 |          2        0.97       95.15
          9 |          5        2.43       97.57
         10 |          2        0.97       98.54
         11 |          3        1.46      100.00
------------+-----------------------------------
      Total |        206      100.00

Thanks again.

Last edited by Gaston Fernandez; 03 Apr 2020, 07:34.

Comment

Hong Il Yoo

Join Date: Jan 2015

Posts: 292
#6

03 Apr 2020, 07:34

I think it would be helpful if you could tell us more about what 0,1,2,3,..., 11 refer to. Is it a count of something (e.g. the number of correct answers in an exam)? Or an ordinal outcome (e.g. 0 for very unhappy, 1 for slightly unhappy, and so on)? Or an interval outcome (e.g. 0 for income in $0 to $500, 1 for income in $501-$1000, and so on)?
Comment
Gaston Fernandez

Join Date: Jul 2015

Posts: 27
#7

03 Apr 2020, 07:59

Thanks for your question.

The variable vgarp is the number of violations of the revealed preference axioms (i.e. GARP) that an individual commits while choosing between consumption alternatives. For example, vgarp = 0 means 0 violations, whereas vgarp = 11 means 11 violations. So it is simply a count of events (i.e. violations) for each individual in my sample.
Comment
Hong Il Yoo

Join Date: Jan 2015

Posts: 292
#8

03 Apr 2020, 10:59

Thanks. In that case, I think -ivpoisson- is more natural than -eoprobit- and -eintreg-. You're also likely to find a lot of useful information in previous threads on count data models with endogenous regressors.
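As a rough, untested sketch of what the -ivpoisson- calls could look like: z1 (an instrument for crt) and x1 (an exogenous control) are hypothetical names that do not appear in this thread, so substitute your own variables.

```stata
* Hypothetical names: z1 = instrument for crt, x1 = exogenous control.

* GMM estimator with multiplicative errors
ivpoisson gmm vgarp x1 (crt = z1), multiplicative

* Control-function estimator
ivpoisson cfunction vgarp x1 (crt = z1)
```

The two estimators rest on different assumptions about how the unobservables enter the model, so comparing them is a reasonable first check. Both treat crt as if it had a linear first stage, which is an approximation for a regressor that only takes the values 0 to 3.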
Comment
Gaston Fernandez

Join Date: Jul 2015

Posts: 27
#9

03 Apr 2020, 13:08

Thanks a lot for your suggestions.
Comment
