  • Three-stage procedure with binary endogenous independent variable using 2SLS

    Dear Statalist,

    I am trying to use a probit first step to get more efficient estimates while avoiding the forbidden regression by using 2SLS. I have an original regression with an endogenous binary independent variable (X1) and a continuous dependent variable (Y). I follow Adams et al. (2009), who use a three-stage procedure (https://www.sciencedirect.com/scienc...27539808000388) that is also mentioned in Wooldridge (2010):

    1. Use probit to regress the endogenous variable on the instrument(s) and exogenous variables, where X1 is the endogenous dummy, Xi-Xn are the exogenous variables and Z is the instrument:
    Code:
    probit X1 Xi-Xn Z i.year i.ffi, vce(robust)
    predict shat, pr
    2. Use the predicted values from the previous step in an OLS first stage together with the exogenous (but without the instrumental) variables:
    Code:
    regress Y X1 Xi-Xn shat i.year i.ffi, robust
    3. Do the second stage as usual:
    Code:
    ivregress 2sls Y X1 Xi-Xn (X1 = shat), vce(robust) first

    My question is whether I have implemented step 3 correctly in Stata. Do I need to use ivregress here, or is there a more suitable command?

  • #2
    2SLS is consistent with a binary endogenous variable so you don't need to do this yourself. You could also do it in GSEM or the eregress routines in Stata 15.
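
    A minimal sketch of the two routes mentioned here, run on simulated data. The variable names (y, x1, x2, z) and the data-generating process are assumptions made for illustration only; they are not the original poster's data.

    ```stata
    * Simulated data: all names and coefficients below are hypothetical
    clear
    set seed 12345
    set obs 2000
    generate z  = rnormal()                       // instrument
    generate x2 = rnormal()                       // exogenous control
    generate v  = rnormal()                       // error in the x1 equation
    generate u  = 0.5*v + rnormal()               // outcome error, correlated with v
    generate byte x1 = (0.8*z + 0.5*x2 + v > 0)   // endogenous dummy
    generate y  = 1 + 2*x1 + x2 + u

    * Plain 2SLS: linear first stage, even though x1 is binary
    ivregress 2sls y x2 (x1 = z), vce(robust)

    * Extended regression model with a probit equation for x1 (Stata 15 or later)
    eregress y x2, endogenous(x1 = z x2, probit) vce(robust)
    ```

    The eregress route relies on joint normality of the errors, whereas plain 2SLS does not, so comparing the two sets of estimates can be informative.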



    • #3
      Dear Dax,

      After doing the first step, you can simply use ivregress to estimate your model using the predicted probability as an instrument. In an early application of this approach (but maybe not the first one) we used both the fitted probabilities and the instruments in the IV regression; see the first two lines on page 291 of this paper:

      Windmeijer, Frank and Santos Silva, J.M.C. (1997), Estimation of Count Data Models with Endogenous regressors; An Application to Demand for Health Care, Journal of Applied Econometrics, 12(3), pp. 281-294.

      Best wishes,

      Joao
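
      The suggestion above could be sketched as follows on simulated data. The variable names (y, x1, x2, z, shat) and the data-generating process are assumptions for illustration. In this sketch the endogenous dummy x1 appears only inside the parentheses, and any year or industry dummies would go in the exogenous list alongside x2.

      ```stata
      * Simulated data: all names and coefficients below are hypothetical
      clear
      set seed 2024
      set obs 2000
      generate z  = rnormal()                       // instrument
      generate x2 = rnormal()                       // exogenous control
      generate v  = rnormal()
      generate u  = 0.5*v + rnormal()               // correlated with v: x1 is endogenous
      generate byte x1 = (0.8*z + 0.5*x2 + v > 0)
      generate y  = 1 + 2*x1 + x2 + u

      * Step 1: probit of the endogenous dummy on the instrument and exogenous variables
      probit x1 z x2, vce(robust)
      predict double shat, pr

      * Steps 2 and 3 in one call: shat is the excluded instrument and
      * ivregress runs the linear first stage itself (shown by the first option)
      ivregress 2sls y x2 (x1 = shat), vce(robust) first

      * Variant: fitted probability and the original instrument together
      ivregress 2sls y x2 (x1 = shat z), vce(robust)
      ```

      Letting ivregress handle both linear stages avoids plugging fitted values into the outcome equation by hand, which is what the forbidden regression would amount to.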



      • #4
        Dear Phil and Joao,

        Thank you for your helpful input, this is clear to me now. Now I want to make sure that X1 really is an endogenous variable, by conducting the Hausman test of exogeneity. I use the following commands
        Code:
        probit X1 Xi-Xn i.year i.ffi, vce(robust)
        predict shat, pr
        regress Y X1 Xi-Xn shat i.year i.ffi, robust
        Now, shat should be statistically significant for X1 to be endogenous. My question is: can I include the year fixed effects (i.year) and industry fixed effects (i.ffi) in the probit regression this way? I have read conflicting views on this, and shat is only significant if I add these fixed effects to the probit estimation.



        • #5
          Dear Dax,

          In principle, you should not include "fixed effects" in a probit model as the estimator will not be consistent. However, in this case you do not really care for the consistency of your probit regression; you are just doing it to transform the data. So, I would say it is OK to do it in this very particular context.

          Best wishes,

          Joao
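
          For comparison, a sketch of two standard ways to test exogeneity after a linear first stage, on simulated data. This is the usual residual-based version, not the probit-based regression in the post above, and the variable names (y, x1, x2, z, vhat) and the data-generating process are assumptions for illustration.

          ```stata
          * Simulated data: all names and coefficients below are hypothetical
          clear
          set seed 777
          set obs 2000
          generate z  = rnormal()
          generate x2 = rnormal()
          generate v  = rnormal()
          generate u  = 0.5*v + rnormal()               // correlated with v: x1 is endogenous
          generate byte x1 = (0.8*z + 0.5*x2 + v > 0)
          generate y  = 1 + 2*x1 + x2 + u

          * Built-in tests of exogeneity after 2SLS
          ivregress 2sls y x2 (x1 = z), vce(robust)
          estat endogenous

          * Regression-based version by hand: add the first-stage residual
          * to the outcome equation and test its coefficient
          regress x1 z x2
          predict double vhat, residuals
          regress y x1 x2 vhat, vce(robust)
          test vhat
          ```

          A significant coefficient on vhat is evidence against treating x1 as exogenous.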



          • #6
            Dear Joao,

            Thanks a lot, this solves my problem!

            Best,

            Dax



            • #7
              Dear Joao Santos Silva,

              Hi, I am new to this site as well as new to Stata. For my Master's degree research I am using a gravity model to study the impact of infrastructure investment on trade, as an empirical investigation for Sri Lanka. I am using 30 years of panel data on ten major export destinations of Sri Lanka.

              Sri Lanka's export values to those countries are the dependent variable of my model, and GDP (of Sri Lanka and the trade partner countries), capital stock (of Sri Lanka and the trade partner countries), and the distance between the two capital cities are the independent variables.

              My regression model is as follows.

              log(X_{1j,t}) = α + β1 log(Y_{1,t}) + β2 log(Y_{j,t}) + β3 log(GG_{1,t}) + β4 log(GG_{j,t}) + β5 D_{1j} + U_{1j,t}

              where X_{1j,t} are exports from country 1 (Sri Lanka) to country j (trading partner) at time t; Y_{1,t} and Y_{j,t} are the GDPs of countries 1 and j at time t; GG_{1,t} and GG_{j,t} are the general government capital stocks of countries 1 and j at time t; and D_{1j} is the distance between the capital cities of the two countries.

              My problems:

              1. How can I incorporate the distance data into my main data set? (I have already combined the GDP, capital stock and export data in Stata format, run basic commands and obtained summary statistics for everything except the distance data.)

              2. What kind of variables should I create to estimate the above regression?

              I would be very grateful if you could tell me how to run my regression, including the distance data, and obtain the output.

              Kind regards,

              Kuloja
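
              One possible way to approach both questions, sketched on a toy dataset. The variable names (partner, dist_km, exports, gdp_lk, gdp_j, gg_lk, gg_j) are hypothetical and the numbers are made-up placeholders, not real trade data. The idea is to keep distance in a separate file with one row per partner country and attach it to the partner-year panel with a many-to-one merge.

              ```stata
              * Hypothetical distance file: one row per partner country (made-up values)
              clear
              input str3 partner dist_km
              "IND" 2400
              "USA" 14000
              end
              tempfile dist
              save `dist'

              * Hypothetical panel: one row per partner-year (made-up values)
              clear
              input str3 partner year exports gdp_lk gdp_j gg_lk gg_j
              "IND" 2000 500  16 470   5 120
              "IND" 2001 520  17 490   6 125
              "USA" 2000 2100 16 10250 5 3000
              "USA" 2001 2000 17 10580 6 3100
              end

              * Attach the time-invariant distance to every year of each partner
              merge m:1 partner using `dist', assert(match) nogenerate

              * Create the logged variables that appear in the model
              foreach v of varlist exports gdp_lk gdp_j gg_lk gg_j {
                  generate ln_`v' = ln(`v')
              }

              * Declare the panel structure (xtset needs a numeric panel identifier)
              encode partner, generate(pid)
              xtset pid year

              * With the full 30-year data, the model could then be estimated with, e.g.:
              * regress ln_exports ln_gdp_lk ln_gdp_j ln_gg_lk ln_gg_j dist_km, vce(cluster pid)
              ```

              The regression line is left as a comment because the toy data have too few observations to estimate it. Distance enters in levels here only because the model as written has D_{1j} unlogged.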



              • #8
                Dear Statalist,

                As argued by Stock and Yogo (2002), I need an F-statistic of at least 10 to conclude that I have no weak instruments. How can I find the F-statistic of my instruments? The predicted probability of my first-stage probit regression serves as an instrument. In addition, I have another instrument (Z). What code do I need to use in order to find the F-statistic of my instruments?
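
                A sketch of how the first-stage F-statistic could be obtained, on simulated data. The variable names (y, x1, x2, z, shat) and the data-generating process are assumptions for illustration. After ivregress, estat firststage reports first-stage statistics, including the F-statistic for the excluded instruments; the manual regression underneath should give a similar joint test.

                ```stata
                * Simulated data: all names and coefficients below are hypothetical
                clear
                set seed 4242
                set obs 2000
                generate z  = rnormal()
                generate x2 = rnormal()
                generate v  = rnormal()
                generate u  = 0.5*v + rnormal()
                generate byte x1 = (0.8*z + 0.5*x2 + v > 0)
                generate y  = 1 + 2*x1 + x2 + u

                * Fitted probability from the probit step
                probit x1 z x2, vce(robust)
                predict double shat, pr

                * 2SLS with both excluded instruments, then first-stage diagnostics
                ivregress 2sls y x2 (x1 = shat z), vce(robust)
                estat firststage

                * By hand: joint test of the excluded instruments in the linear first stage
                regress x1 shat z x2, vce(robust)
                test shat z
                ```

                The joint test covers only the excluded instruments (shat and z), not the exogenous controls that also appear in the outcome equation.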
