Margins after Control Function Approach with probit

Sidika Tunc Candogan

Join Date: Oct 2020

Posts: 9
#1

13 Oct 2020, 04:01

Hi all,

I am applying the control function approach described in Wooldridge (2014) as follows:

regress End_var i.IV Control_variables
predict resid_1, residuals
probit Y End_var c.End_var##c.End_var resid_1 c.resid_1##c.resid_1 Control_variables, vce(bootstrap, reps(1000))

After the two-stage estimation, I'd like to graph the impact of End_var on Y. Can I use the margins command as follows?

margins, at(End_var=(0(1)19)) atmeans
marginsplot

What if I do not write "atmeans"?

Thank you very much for your help!
Tags: binary outcome, control function, margins, probit
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#2

13 Oct 2020, 04:34

I figure out -margins- by reading the manual.

However, you are not implementing the procedure correctly. You need to bootstrap both stages of the procedure.

A further error is that you should not include the square of the residual; you should include only the residual itself.

Last edited by Joro Kolev; 13 Oct 2020, 04:37.
Sidika Tunc Candogan

Join Date: Oct 2020

Posts: 9
#3

13 Oct 2020, 05:12

Thank you for your reply!

Could you please explain why I need to bootstrap the first stage? I thought I only needed to correct the standard errors in the second stage.

Also, since I want to see the impact of End_var^2, Wooldridge(2014) suggests adding the square of residuals in the second stage.

Thank you!
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#4

13 Oct 2020, 05:25

Which paper is Wooldridge(2014), and on which page does Wooldridge(2014) suggest that you should include the square of the residual? I do not think this is true.

As to why you need to bootstrap both stages, there are two explanations:

1. Reference to authority: because Wooldridge (2014) says so. Reread the part of the paper where he discusses the bootstrap, and you will see that he says you should bootstrap both stages.

2. The intuitive explanation: if you bootstrap only the second stage, you are effectively treating the first stage as given and nonrandom. This would be correct if you knew the true population parameters of the first stage, but you don't; you estimate them, subject to sampling error. To account for the first-stage sampling error, you need to bootstrap both stages as a single procedure.

Sidika Tunc Candogan

Join Date: Oct 2020

Posts: 9
#5

13 Oct 2020, 05:44

Thanks for your explanation!

The paper is Wooldridge, Jeffrey M. "Control function methods in applied econometrics." Journal of Human Resources 50.2 (2015): 420-445. (Please see page 437).

So do you suggest the following approach:

regress End_var i.IV Control_variables, vce(bootstrap, reps(1000))
predict resid_1, residuals
probit Y End_var c.End_var##c.End_var resid_1 Control_variables, vce(bootstrap, reps(1000))

Here, my question is whether I can safely use margins after this two-stage estimation?

Thank you!
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#6

13 Oct 2020, 06:05

Look at this thread here:

https://www.statalist.org/forums/for...quadratic-term

I think it is exactly your case, and Professor Wooldridge himself elaborates on the method there. You do not put in a squared residual. I agree that a formula with the squared residual appears on page 437 of the paper you cite, but I think that is something different. I have studied this paper quite carefully, but I do not remember everything in it. My guess is that the squared residual, and the interaction of the residual with the exogenous controls, are meant for when the first stage is suspected to be nonlinear (but I do not remember for sure).

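And no, bootstrapping both stages goes something like the following. It is not trivial; what follows is a sketch using the placeholder names from your posts, and making it work on real data requires... well, work.

Code:

capture program drop mytwostage
program define mytwostage, eclass
    * drop the residual left by the previous call so predict can recreate it
    capture drop resid_1
    * first stage: endogenous variable on the instrument and the controls
    regress End_var i.IV Control_variables
    predict double resid_1, residuals
    * second stage: probit with the first-stage residual as the control function
    probit Y c.End_var##c.End_var resid_1 Control_variables
end
bootstrap _b, reps(1000): mytwostage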
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2153
#7

13 Oct 2020, 06:15

You may try any function of the residuals that you want to allow a flexible functional form. For my JHR paper I wrote a program to implement the two steps and then call the program. It doesn’t work to bootstrap separately as in #5. If you send me an email at my MSU address I will send the code I used.
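
A minimal sketch of the "one program, then bootstrap the program" idea follows. It is not the code from the JHR paper. The program name cf_probit, the single control x1, the simulated data, and the choice of a quadratic in the residual are all assumptions made for illustration.

```stata
* Simulated data purely for illustration; all names and values are hypothetical
clear
set seed 12345
set obs 2000
generate byte IV = runiform() < 0.5            // binary instrument
generate double x1 = rnormal()                 // stand-in for Control_variables
generate double v = rnormal()                  // first-stage error
generate double End_var = 8 + 3*IV + x1 + 2*v
generate byte Y = (-1 + 0.3*End_var - 0.015*End_var^2 + 0.5*x1 + 0.5*v + rnormal()) > 0

capture program drop cf_probit
program define cf_probit, eclass
    * Step 1: linear first stage; rebuild the residual in every bootstrap sample
    capture drop resid_1
    regress End_var i.IV x1
    predict double resid_1, residuals
    * Step 2: probit with a quadratic in the control function
    probit Y c.End_var##c.End_var c.resid_1##c.resid_1 x1
end

* Resample the data and rerun BOTH steps in each replication
bootstrap _b, reps(200) seed(2020): cf_probit
```

Because both steps run inside cf_probit, each bootstrap replication re-estimates the first stage, so the reported standard errors reflect the sampling error in resid_1.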
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#8

13 Oct 2020, 06:24

Professor Wooldridge, could you please go over once more the inclusion of the first-stage residual, the residual squared, and the residual interacted with the exogenous first-stage regressors? Was I right to say that this is done to allow for a potentially nonlinear first stage? Or am I completely at a loss as to why you do it around page 437 of the cited paper?

And I will write an email to you with a request for the data and the code. I sent you this request once before, but you kindly disregarded me :-).

I have a competing (and, I think, much better) method of computing the correct standard errors, and to be able to write up my paper I need to reproduce, more or less, what you do in your JHR 2015 paper.

Sidika Tunc Candogan

Join Date: Oct 2020

Posts: 9
#9

13 Oct 2020, 06:46

Joro Kolev Thank you very much for sharing the earlier post!
Jeff Wooldridge Thank you very much for your reply! I've sent an email to your MSU address. Looking forward to your reply, thank you!
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2153
#10

13 Oct 2020, 07:36

Joro: My apologies. I certainly did not intend to disregard it. I can search for your old email, or you can send me a new one.

Once one goes the CF route, any second stage can be viewed as an approximation to the true conditional model. Putting in nonlinear functions of the CF has nothing to do with the first stage being nonlinear; in fact, it is generally hard to justify in those cases. It actually works best when the first stage is linear in parameters. Once the CFs are included, they act as any other control variable. We probably would not compute marginal effects for the CFs, but one could in order to gauge their importance.

Jeff
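
One way to gauge the importance of the control function terms can be sketched as follows. This is only an illustration on simulated data: the variable x1 and all numeric values are hypothetical, and the standard errors come from the plain second-stage probit, so they do not account for the first-stage estimation.

```stata
* Simulated data purely for illustration; all names and values are hypothetical
clear
set seed 12345
set obs 2000
generate byte IV = runiform() < 0.5            // binary instrument
generate double x1 = rnormal()                 // stand-in for Control_variables
generate double v = rnormal()                  // first-stage error
generate double End_var = 8 + 3*IV + x1 + 2*v
generate byte Y = (-1 + 0.3*End_var - 0.015*End_var^2 + 0.5*x1 + 0.5*v + rnormal()) > 0

* Two-step control function fit (point estimates only)
regress End_var i.IV x1
predict double resid_1, residuals
probit Y c.End_var##c.End_var c.resid_1##c.resid_1 x1

* Joint Wald test that both control function terms have zero coefficients
test resid_1 c.resid_1#c.resid_1

* Average marginal effect of the control function, as a rough gauge of its importance
margins, dydx(resid_1)
```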
Sidika Tunc Candogan

Join Date: Oct 2020

Posts: 9
#11

13 Oct 2020, 08:26

Jeff Wooldridge

Dear Jeff,

Thank you very much for sharing the program. Now, I know exactly how I should apply the control function approach.

Now, I want to plot the impact of my endogenous variable on my dependent variable.

Can I still use the margins command as follows:

margins, at(End_var=(0(1)19)) atmeans

I want all other variables at their means, but I am not sure whether residuals should be at their means or not.

Thank you very much!
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2153
#12

13 Oct 2020, 08:44

My preference is to average out all of the variables in the response function rather than plugging in their means. This produces what Blundell and Powell refer to as the Average Structural Function. It means all units contribute to the estimates, rather than just those at the mean. At a minimum, I would average out resid_1 (and its square, and so on), so rather than a blanket "atmeans" I would specify which variables to hold at their means and exclude resid_1.
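
The two margins specifications described here can be sketched as follows. The data are simulated and the control x1 is a hypothetical stand-in for Control_variables. The standard errors that margins reports in this sketch treat resid_1 as data, so they do not reflect the first-stage estimation.

```stata
* Simulated data purely for illustration; all names and values are hypothetical
clear
set seed 12345
set obs 2000
generate byte IV = runiform() < 0.5            // binary instrument
generate double x1 = rnormal()                 // stand-in for Control_variables
generate double v = rnormal()                  // first-stage error
generate double End_var = 8 + 3*IV + x1 + 2*v
generate byte Y = (-1 + 0.3*End_var - 0.015*End_var^2 + 0.5*x1 + 0.5*v + rnormal()) > 0

* Two-step control function fit (point estimates only)
regress End_var i.IV x1
predict double resid_1, residuals
probit Y c.End_var##c.End_var c.resid_1##c.resid_1 x1

* (a) Average structural function: every covariate, including resid_1,
*     stays at its observed values and is averaged over the sample
margins, at(End_var=(0(1)19))
marginsplot

* (b) Controls held at their means, resid_1 still averaged out
margins, at((mean) x1 End_var=(0(1)19))
```

Writing resid_1 in factor-variable notation (c.resid_1##c.resid_1) lets margins treat the squared term as a function of resid_1 instead of as a separate variable.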
Sidika Tunc Candogan

Join Date: Oct 2020

Posts: 9
#13

13 Oct 2020, 09:08

Thank you very much for your quick response! I appreciate your help a lot.
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2153
#14

13 Oct 2020, 09:39

Joro Kolev I found your email from a gmail account. Should I send there?
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#15

13 Oct 2020, 10:25

Yes, Professor Wooldridge, please send it to the gmail address.

And if possible, please send both the do-files and the dta files, so that I can build on your work in the article.

I will reply with an email outlining my idea (it is something very simple); you might find the alternative way I have in mind for computing the correct standard errors interesting.

