  • Margins after Control Function Approach with probit

    Hi all,

    I am applying the control function approach described in Wooldridge (2014) as follows:

    regress End_var i.IV Control_variables
    predict resid_1, residuals
    probit Y End_var c.End_var##c.End_var resid_1 c.resid_1##c.resid_1 Control_variables, vce(bootstrap, reps(1000))

    After the two-stage estimation, I'd like to graph the impact of End_var on Y. Can I use the margins command as follows?

    margins, at(End_var=(0(1)19)) atmeans
    marginsplot

    What if I do not write "atmeans"?

    Thank you very much for your help!

  • #2
    For -margins-, I figure things out by reading the manual.

    However, you are not implementing the procedure correctly. You need to bootstrap both stages of the procedure.

    A further error is that you include the square of the residual; you should include only the residual itself.
    Last edited by Joro Kolev; 13 Oct 2020, 04:37.

    • #3
      Thank you for your reply!

      Could you please explain why I need to bootstrap the first stage? I thought I only needed to correct the standard errors in the second stage.

      Also, since I want to see the impact of End_var^2, Wooldridge (2014) suggests adding the square of the residuals in the second stage.

      Thank you!

      • #4
        Which paper is Wooldridge(2014), and on which page does Wooldridge(2014) suggest that you should include the square of the residual? I do not think this is true.

        As to why you need to bootstrap both stages, there are two explanations:

        1. Reference to authority: because Wooldridge (2014) says so. Reread the paper, whichever one Wooldridge (2014) is, around where he mentions the bootstrap, and you will see that he says you should bootstrap both stages.

        2. The intuitive explanation: if you bootstrap only the second stage, you are effectively treating the first stage as given, that is, nonrandom. This would be correct if you knew the true population parameters of the first stage, but you don't. You are estimating them, subject to sampling error. To account for the first-stage sampling error, you need to bootstrap both stages as a single procedure.

        (In reply to #3, originally posted by Sidika Tunc Candogan.)

        • #5
          Thanks for your explanation!

          The paper is Wooldridge, Jeffrey M. "Control function methods in applied econometrics." Journal of Human Resources 50.2 (2015): 420-445. (Please see page 437).

          So do you suggest the following approach?

          regress End_var i.IV Control_variables, vce(bootstrap, reps(1000))
          predict resid_1, residuals
          probit Y End_var c.End_var##c.End_var resid_1 Control_variables, vce(bootstrap, reps(1000))

          Here, my question is whether I can safely use margins after this two-stage estimation?

          Thank you!

          • #6
            Look at this thread here:

            https://www.statalist.org/forums/for...quadratic-term

            I think it is exactly your case, and Professor Wooldridge himself elaborates on the method there. You do not include a squared residual. I agree that a formula with the squared residual appears on page 437 of the paper you cite, but I think that is something different. I have studied this paper quite carefully, but I do not remember everything in it. My guess is that the squared residual, and the interaction of the residual with the exogenous controls, are for when the first stage is suspected to be nonlinear (but I do not remember for sure).

            And no, that is not what I am suggesting. Bootstrapping both stages goes something like the sketch below. It is not trivial; making it work requires... well, work.

            Code:
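            capture program drop mytwostage
            program define mytwostage, eclass
                // remove the residual left over from the previous bootstrap replication
                capture drop resid_1
                // first stage: endogenous regressor on the instrument and the exogenous controls
                // (End_var, IV, Y and Control_variables are the placeholders used in #1)
                regress End_var i.IV Control_variables
                predict double resid_1, residuals
                // second stage: probit with the first-stage residual as the control function
                probit Y c.End_var##c.End_var resid_1 Control_variables
            end

            // resample the data and rerun both stages in every replication
            bootstrap _b, reps(1000): mytwostage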

            • #7
              You may try any function of the residuals that you want to allow a flexible functional form. For my JHR paper I wrote a program to implement the two steps and then call the program. It doesn’t work to bootstrap separately as in #5. If you send me an email at my MSU address I will send the code I used.
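
              A minimal sketch of what such a program might look like, run on simulated data so that it is self-contained. This is not the code from the JHR paper. The program name cf_flex, the variables z, w, x, y and vhat, and the data-generating process are all assumptions made for illustration, and the squared residual is only one possible flexible function of the residual.

```stata
clear
set seed 2020
set obs 2000
* simulated data: all names and coefficients are illustrative assumptions
generate z = rnormal()                      // instrument
generate w = rnormal()                      // exogenous control
generate u = rnormal()                      // unobservable causing endogeneity
generate x = 0.8*z + 0.5*w + u + rnormal()  // endogenous regressor
generate y = (0.5*x - 0.05*x^2 + 0.3*w - u + rnormal() > 0)

capture program drop cf_flex
program define cf_flex, eclass
    capture drop vhat                       // residual from the previous replication
    regress x z w                           // step 1: linear first stage
    predict double vhat, residuals
    probit y c.x##c.x c.vhat##c.vhat w      // step 2: residual and its square as control functions
end

* both steps are rerun on every bootstrap sample
bootstrap _b, reps(200) seed(1): cf_flex
```

              Because bootstrap calls the whole program, the reported standard errors reflect the sampling variation of both steps. The value reps(200) only keeps the sketch fast; the thread uses 1000.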

              • #8
                Professor Wooldridge, could you please explain again the inclusion of the first-stage residual, the residual squared, and the residual interacted with the exogenous first-stage regressors? Was I right to say that this is done to allow for a potentially nonlinear first stage? Or am I completely at a loss as to why you do it around page 437 of the cited paper?

                And I will write an email to you with a request for the data and the code. I sent you this request once before, but you kindly disregarded me :-).

                I have a competing (and, I think, much better) method of computing the correct standard errors, and to spin out my paper I need to be able to reproduce, more or less, what you do in your JHR 2015 paper.

                • #9
                  Joro Kolev Thank you very much for sharing the earlier post!
                  Jeff Wooldridge Thank you very much for your reply! I've sent an email to your MSU address. Looking forward to your reply, thank you!

                  • #10
                    Joro: My apologies. I certainly did not intend to disregard you. I can search for your old email, or you could send me a new one.

                    Once one goes the CF route, any second stage can be viewed as an approximation to the true conditional model. Putting in nonlinear functions of the CF has nothing to do with the first stage being nonlinear; in fact, it is generally hard to justify in those cases. It actually works best when the first stage is linear in parameters. Once the CFs are included, they act as any other control variable. We probably would not compute marginal effects for the CFs, but one could in order to gauge their importance.
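
                    The idea that the control functions can be treated like any other control, and their importance gauged, can be sketched as follows on simulated data. The variable names (z, w, x, y, vhat) and the data-generating process are assumptions made for illustration only. No bootstrap is used here, on the usual argument that under the null hypothesis that the control function terms are zero, the first-stage estimation does not affect the test; treat that as an assumption of this sketch.

```stata
clear
set seed 2020
set obs 2000
* simulated data: names and coefficients are illustrative assumptions
generate z = rnormal()                      // instrument
generate w = rnormal()                      // exogenous control
generate u = rnormal()                      // unobservable causing endogeneity
generate x = 0.8*z + 0.5*w + u + rnormal()  // endogenous regressor
generate y = (0.5*x - 0.05*x^2 + 0.3*w - u + rnormal() > 0)

regress x z w                               // first stage, linear in parameters
predict double vhat, residuals
probit y c.x##c.x c.vhat##c.vhat w          // second stage with vhat and vhat^2

* joint Wald test that both control function terms are zero
test vhat c.vhat#c.vhat

* average marginal effect of the residual, only to gauge its importance
margins, dydx(vhat)
```

                    A clear rejection in the joint test suggests that the control functions are doing real work, in which case the standard errors of the other coefficients should come from bootstrapping both steps.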

                    Jeff

                    • #11
                      Jeff Wooldridge

                      Dear Jeff,

                      Thank you very much for sharing the program. Now, I know exactly how I should apply the control function approach.

                      Now, I want to plot the impact of my endogenous variable on my dependent variable.

                      Can I still use the margins command as follows?

                      margins, at(End_var=(0(1)19)) atmeans

                      I want all other variables at their means, but I am not sure whether residuals should be at their means or not.

                      Thank you very much!

                      • #12
                        My preference is to average out all of the variables in the response function rather than plugging in their means. This produces what Blundell and Powell refer to as the Average Structural Function. It means all units contribute to the estimates, rather than just those at the mean. At a minimum, I would average out resid_1 (and its square, and so on), so I would name the variables to be held at their means explicitly, rather than using a blanket "atmeans", and exclude resid_1.
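
                        The three choices discussed here can be sketched with margins on simulated data. The variable names (z, w, x, y, vhat), the data-generating process, and the grid of x values are assumptions made for illustration. Since atmeans takes no variable list, the sketch uses the (mean) specification inside at() to hold only selected variables at their means.

```stata
clear
set seed 2020
set obs 2000
* simulated data: names and coefficients are illustrative assumptions
generate z = rnormal()                      // instrument
generate w = rnormal()                      // exogenous control
generate u = rnormal()                      // unobservable causing endogeneity
generate x = 0.8*z + 0.5*w + u + rnormal()  // endogenous regressor
generate y = (0.5*x - 0.05*x^2 + 0.3*w - u + rnormal() > 0)

regress x z w                               // first stage
predict double vhat, residuals
probit y c.x##c.x vhat w                    // second stage with the residual as control function

* (a) every other covariate, including vhat, fixed at its sample mean
margins, at(x=(-2(1)4)) atmeans

* (b) hold w at its mean, but average the prediction over each unit's vhat
margins, at((mean) w x=(-2(1)4))

* (c) Average Structural Function: fix x, average over each unit's w and vhat
margins, at(x=(-2(1)4))
marginsplot
```

                        The standard errors that margins prints here come from the second-stage probit alone, so they do not account for the first stage. Getting valid ones would presumably require computing the margins inside the bootstrapped two-step program.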

                        • #13
                          Thank you very much for your quick response! I appreciate your help a lot.

                          • #14
                            Joro Kolev I found your email from a gmail account. Should I send it there?

                            • #15
                              Yes Professor Wooldridge, please send it to the gmail address.

                              And, if possible, please send both the do-files and the dta files, so that I can build on your work in the article.

                              I will reply with an email describing my idea (it is something very simple). You might find the alternative way I have in mind for computing the correct standard errors interesting.
