IV-oprobit using cmp command

Felix Lukowski

Join Date: Sep 2014

Posts: 5
#1

13 Sep 2014, 13:27

Dear all,

I am running an IV ordered probit regression using the cmp command. The dependent variable is education level (discrete and ordered), which is regressed on patience and a set of controls. My idea is to instrument patience with parents’ education.

The cmp command (approach 1) is the following:

cmp (edulevel=patience `controls') (patience=parentsedu `controls') , ind($cmp_oprobit $cmp_cont) nolr

I tried to replicate the results manually by using a two-step procedure (approach 2), where I regress patience on its instruments, and use the predicted values of patience in the ordered probit estimation:

regress patience parentsedu `controls'
predict patiencehat
oprobit edulevel patiencehat `controls'

The results differ substantially. I noticed that using cmp the sample sizes change between the different steps. But even when keeping the sample size constant, I do not obtain the same results as with the first approach.
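A minimal sketch of one way to hold the estimation sample fixed across both approaches, assuming complete-case analysis is what is wanted. The control names (`age`, `female`) and the marker variable `touse` are hypothetical placeholders, not taken from the post.

```stata
* Hypothetical control list; replace with the actual controls
local controls age female

* Flag observations with no missing values in any model variable
mark touse
markout touse edulevel patience parentsedu `controls'

* Approach 1: joint ML via cmp, restricted to the common sample
cmp setup   // defines $cmp_oprobit, $cmp_cont, etc.
cmp (edulevel = patience `controls') (patience = parentsedu `controls') if touse, ///
    ind($cmp_oprobit $cmp_cont) nolr

* Approach 2: manual two-step on exactly the same observations
regress patience parentsedu `controls' if touse
predict patiencehat if touse
oprobit edulevel patiencehat `controls' if touse
```

With the sample aligned this way, any remaining difference between the two sets of estimates comes from the estimators themselves rather than from differing observations.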

How exactly does cmp fit the model, and why do the results of the two approaches differ? Does the manual implementation of approach 2 make sense, or was it a bad idea in the first place?

Thank you in advance.
David Roodman

Join Date: Jul 2014

Posts: 479
#2

15 Sep 2014, 06:03

Hi Felix,
I describe the econometrics of cmp in this paper. The model assumes that the error terms of the two equations follow a bivariate normal distribution. cmp uses maximum likelihood to model this error process directly.
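For concreteness, the model described here can be written out as follows. The notation ($\alpha$, $\beta$, $\gamma$, $\rho$, $\sigma$, cut points $c_j$) is an assumption made for illustration, and this is the textbook likelihood for such a model rather than a transcription of cmp's internals.

```latex
\begin{align*}
y_1^* &= x\beta + \alpha\, y_2 + \varepsilon_1, \qquad
y_1 = j \iff c_{j-1} < y_1^* \le c_j, \\
y_2 &= z\gamma + \varepsilon_2, \qquad
\begin{pmatrix}\varepsilon_1 \\ \varepsilon_2\end{pmatrix}
\sim N\!\left(
\begin{pmatrix}0 \\ 0\end{pmatrix},
\begin{pmatrix}1 & \rho\sigma \\ \rho\sigma & \sigma^2\end{pmatrix}
\right).
\end{align*}
% Since \varepsilon_1 \mid \varepsilon_2 \sim N(\rho\,\varepsilon_2/\sigma,\ 1-\rho^2),
% the likelihood contribution of an observation with y_1 = j is
\begin{equation*}
L_i = \frac{1}{\sigma}\,\phi\!\left(\frac{e_2}{\sigma}\right)
\left[
\Phi\!\left(\frac{c_j - m - \rho\, e_2/\sigma}{\sqrt{1-\rho^2}}\right)
-
\Phi\!\left(\frac{c_{j-1} - m - \rho\, e_2/\sigma}{\sqrt{1-\rho^2}}\right)
\right],
\end{equation*}
% where e_2 = y_2 - z\gamma and m = x\beta + \alpha\, y_2.
```

All parameters, including $\rho$, are estimated jointly, which is what distinguishes this from a plug-in two-step procedure.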

Your two-stage approach is intuitive, but inconsistent. Unfortunately, I cannot reconstruct the reasoning right now (I should be able to!). Here is a demonstration:

Code:

set obs 10000
mat C = 1, .5, 1
drawnorm e1 e2, corr(C) cstor(lower) // 1st- and 2nd-stage errors, correlated 0.5
drawnorm z // instrument
gen x = z+e1 // instrumented variable
gen ystar = x+e2 // unobserved 2nd-stage dependent variable
egen y = cut(ystar), at(-10 -1 0 1 2 10) // censored, observed version of dep var
oprobit y x // inconsistent (true coef = 1)
cmp (y=x), ind(5) qui // same
cmp (y=x) (x=z), ind(5 1) qui // consistent
regress x z
predict xhat
oprobit y xhat // two-stage approach inconsistent too

Rivers and Vuong is a key paper discussing methods of consistent estimation for the IV-probit set-up, which is of course closely related to yours. However, it's a bit out of date because it is written on the assumption that direct ML estimation, as in cmp, is computationally prohibitive. But computers are a lot faster now and when ML is practical, it's the most efficient.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2291
#3

15 Sep 2014, 10:16

Just to follow up on David's point. You can easily see the problem in the standard probit case (Rivers and Vuong). If y2 is the endogenous explanatory variable and y2 = z*g2 + v2 is its reduced form, then the two-step approach replaces y2 with its first-stage fitted values, which are essentially z*g2. But you can't ignore the term this adds to the error, a1*v2, where a1 is the coefficient on y2. The effective error under the plug-in approach -- sometimes called the "forbidden regression" -- is e1 + a1*v2, where e1 has a standard normal distribution. Now the error variance is a1^2*sv2^2 + se1^2 = a1^2*sv2^2 + 1, because e1 has unit variance. So the coefficients you estimate are divided by a constant greater than one: you want a1 but you get a1/sqrt(a1^2*sv2^2 + 1). It is very clumsy to undo this; I show how in Chapter 15 of my 2010 MIT Press book. As David said, MLE is relatively easy now and you should use that.
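The rescaling argument can be sketched in symbols, using the names from this post (a1, g2, v2, e1, sv2). Two things here are assumptions added for illustration: the term $z_1 d_1$ for the exogenous part of the structural equation, and the explicit covariance term, which is written out in case e1 and v2 are correlated. The last lines apply the formula to the parameter values in David's simulation above.

```latex
\begin{align*}
y_1^* &= z_1 d_1 + a_1 y_2 + e_1, \qquad y_2 = z g_2 + v_2, \\
y_1^* &= z_1 d_1 + a_1 (z g_2) + \underbrace{(e_1 + a_1 v_2)}_{\text{effective error}}, \\
\operatorname{Var}(e_1 + a_1 v_2)
  &= 1 + a_1^2 s_{v_2}^2 + 2 a_1 \operatorname{Cov}(e_1, v_2).
\end{align*}
% The plug-in (ordered) probit normalizes the effective error to unit
% variance, so it estimates a_1 divided by the standard deviation above.
% This reduces to a_1 / \sqrt{a_1^2 s_{v_2}^2 + 1} when the cross term vanishes.
%
% In the simulation: a_1 = 1, s_{v_2}^2 = 1, Cov = 0.5, so
\begin{equation*}
\operatorname{plim}\ \hat a_1^{\text{two-step}}
  = \frac{1}{\sqrt{1 + 1 + 2(0.5)}} = \frac{1}{\sqrt{3}} \approx 0.58 .
\end{equation*}
```

So in that demonstration the coefficient on xhat should settle near 0.58 rather than the true value of 1, up to sampling noise.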

An alternative is a control function approach to estimate the average partial effects, but there is nothing to be gained over MLE in your setup.
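A minimal sketch of what a control function version might look like, in the spirit of Rivers and Vuong. It assumes the simulated variables `y`, `x`, and `z` from David's demonstration are still in memory; the residual name `vhat` is a hypothetical choice.

```stata
* First stage: regress the endogenous variable on the instrument
regress x z

* Keep the first-stage residuals as the control function
predict double vhat, residuals

* Second stage: include the actual x (not its fitted value) plus the residual
oprobit y x vhat
```

Unlike the plug-in regression, this keeps the observed x in the second stage and adds the residual as an extra regressor. The coefficients are still scaled, here by 1/sqrt(1 - rho^2) where rho is the error correlation, so the coefficient on x is not a direct estimate of the structural parameter. The second-stage standard errors also ignore that vhat is estimated, although the t statistic on vhat is commonly used as a test of exogeneity.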
Felix Lukowski

Join Date: Sep 2014

Posts: 5
#4

16 Sep 2014, 02:59

Thank you, David and Jeff. Your replies were very helpful and instructive.
David Roodman

Join Date: Jul 2014

Posts: 479
#5

16 Sep 2014, 10:02

Thank you, Jeff! I knew I had read about this in your textbook, but I couldn't find it, and couldn't remember the term. "Spurious regression," I was thinking... Page 236 of the 2002 edition.
Nikos Korompos

Join Date: Jan 2017

Posts: 66
#6

12 Feb 2017, 04:47

Hi,

I have a similar case, in which my endogenous variable is ordinal (6-point scale) and the response variable is also ordinal (5-point scale). Could I use the cmp command with ind($cmp_oprobit $cmp_oprobit), or is that generally inconsistent?

If this is not the appropriate method, could you please suggest an alternative (e.g. plain 2SLS, or treating the endogenous variable as continuous)? I have seen some papers doing that, but I am not sure whether it is an appropriate method.

Thank you in advance.

Best regards,

Nikos
Nikos Korompos

Join Date: Jan 2017

Posts: 66
#7

20 Oct 2017, 06:30

Hi, could you please help me with the comment above about the cmp command?

Thank you in advance. Much appreciated.

Best,

Ilias