  • Steph Ki
    started a topic Two-step IV method with binary dependent variable

    Two-step IV method with binary dependent variable

    Dear Statalist,

    I am estimating a logit model in which the dependent variable is a dummy and the key predictor is also a binary variable that is likely endogenous (a simultaneity problem).

    1) I attempt to perform IV estimation: first, I run a logit model of my (binary) endogenous variable on one excluded instrument (which takes the values 1, 2, 3, 4 and 5) and control variables (age, age squared, living location (rural or urban), gender, ...). Second, I estimate a logit model of the dummy dependent variable on the fitted probabilities, which replace the endogenous regressor.

    Question: Is that correct?

    2) I also use a two-step estimator: first, I estimate a logit or probit model of the binary endogenous regressor on the excluded instrument and control variables. Then I run ivregress 2sls of the dummy dependent variable, using the fitted probabilities as the excluded instrument and adding the control variables.

    Question: Is that correct in my case? Or can I just use Stata's ivreg2 command to perform the IV estimation?

    Many thanks in advance

    Steph

  • Dounia Ouederni
    replied
    Hey! I am really sorry to disturb you, but by any chance do you know how to run an IV biprobit regression when your dependent variable and endogenous variable are the same?

    Thank you in advance,
    Dounia
    Last edited by Dounia Ouederni; 29 Apr 2024, 14:40.


  • Kerstin Schmidt
    replied
    Thank you Jeff Wooldridge!!!


  • Jeff Wooldridge
    replied
    I discuss this in Section 15.7.3 in my 2010 MIT Press book. Example 15.4 gives an example, and I provide a few citations to other papers where 2SLS and biprobit are compared. I don't regularly check my gmail account and so I missed Kerstin's query.


  • Luciana Jaime
    replied
    Originally posted by Kerstin Schmidt
    Two questions regarding post #3 by Jeff Wooldridge:

    (1) Is there any citable literature where you/he propose/s the two methods: (1) standard 2SLS and (2) bivariate probit model?
    (2) To not mess it up: When comparing the estimates of the standard 2SLS and bivariate probit model, both estimates are interpreted as percentage points, right?
    Hello, Kerstin:

    I just recently discovered that you had the same problem as me. Did you find the studies you were looking for? If so, I would be grateful if you could share them. Thanks.


  • Kerstin Schmidt
    replied
    Two questions regarding post #3 by Jeff Wooldridge:

    (1) Is there any citable literature where you/he propose/s the two methods: (1) standard 2SLS and (2) bivariate probit model?
    (2) To not mess it up: When comparing the estimates of the standard 2SLS and bivariate probit model, both estimates are interpreted as percentage points, right?


  • NJ JAIN
    replied
    Hi,

    May I ask for guidance on the following two questions, please?

    Q1: For a problem involving a binary dependent variable, a binary endogenous variable and two binary instruments, can I use the approach below? Please advise.

    Step 1: Estimate the endogenous variable using the two binary instruments and other exogenous covariates.


    eststo Probit_Gov1: probit CS4_govt CS23 CS22 i.TA10A Nchild_adult Income_person i.RO3 i.RO5 COPC i.HHEDUC i.ED6 CS10-CS12 i.CS8 CS5 i.ID11 ED7 i.ID13 i.STATE [fweight = FWT], vce(cluster IDHH)
    predict Probit_Gov1

    Step 2: Estimate the endogenous variable from the estimate in step 1 and other exogenous covariates.

    eststo Probit_Gov2: probit CS4_govt Probit_Gov1 i.TA10A Nchild_adult Income_person i.RO3 i.RO5 COPC i.HHEDUC i.ED6 CS10-CS12 i.CS8 CS5 i.ID11 ED7 i.ID13 i.STATE [fweight = FWT], vce(cluster IDHH)
    predict Probit_Gov2

    Step 3: Use the estimate from step 2 as an instrument in IVProbit.

    ivprobit TA10B_flag i.TA10A Nchild_adult Income_person i.RO3 i.RO5 COPC i.HHEDUC i.ED6 CS10-CS12 i.CS8 CS5 i.ID11 ED7 i.ID13 i.STATE ///
    (CS4_govt = Probit_Gov1) [fweight = FWT], twostep

    In the above code, TA10B_flag and CS4_govt are both binary, while CS23 and CS22 are the original binary instruments.


    Q2: Does IV Probit take a very long time to execute?




  • Steph Ki
    replied
    Dear JW, thank you very much for your answer and suggestions. I have read something about computing the average marginal effect:
    Code:
    *biprobit
    biprobit (y1 y2 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11) (y2 z x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11)
    mfx
    
    replace y2=0
    predict double adjpredy2_0
    replace y2=1
    predict double adjpredy2_1
    gen double mey2= adjpredy2_1 - adjpredy2_0
    
    sum adjpredy2_1 adjpredy2_0 mey2 if e(sample)
    
    * And I get the following:

    Variable       Obs        Mean   Std. Dev.         Min        Max
    adjpredy2_1    350    .2572922    .3242765    6.89e-14   .9762893
    adjpredy2_0    350    .2831236    .3444418    6.89e-14    .985427
    mey2           350   -.0258314      .02986   -.0828319   1.11e-16

    If the code is correct, do I have to use bootstrapping to get standard errors?

    I have one more question regarding the standard linear model estimated by 2SLS (in my previous post): I get a higher value of the Kleibergen-Paap rk Wald F statistic (with the robust option). Is that correct, and can I rely on it to assess the relevance of the instrument?

    Thank you very much


  • Jeff Wooldridge
    replied
    Actually, I have my doubts about the margins command, too. In the past, I've done the calculation by hand. The downside is having to use something like bootstrapping to get a standard error.
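
    A minimal sketch of the by-hand calculation with a bootstrapped standard error, using the thread's placeholder names (y1, y2, z, x1 to x11). The program name biprobit_ame, the number of replications and the seed are arbitrary assumptions for illustration, and the code has not been run on the poster's data. It averages the change in the marginal probability Pr(y1 = 1) when y2 is switched from 0 to 1, built from the linear index of the first equation (predict option xb1). Note that predict after biprobit returns the joint probability Pr(y1 = 1, y2 = 1) by default, which is not the quantity wanted here.

    ```stata
    capture program drop biprobit_ame
    program define biprobit_ame, rclass
        * Joint ML estimation: equation 1 is the outcome, equation 2 the endogenous regressor
        biprobit (y1 = y2 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11) ///
                 (y2 = z x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11)

        tempvar y2orig xb_1 xb_0 me
        generate double `y2orig' = y2          // keep the observed values

        replace y2 = 1
        predict double `xb_1', xb1             // index of equation 1 with y2 = 1
        replace y2 = 0
        predict double `xb_0', xb1             // index of equation 1 with y2 = 0
        replace y2 = `y2orig'                  // restore the observed values

        * Observation-level effect on Pr(y1 = 1), then its sample average
        generate double `me' = normal(`xb_1') - normal(`xb_0')
        summarize `me' if e(sample), meanonly
        return scalar ame = r(mean)
    end

    * Re-estimates the model and recomputes the average on each bootstrap sample
    bootstrap ame = r(ame), reps(200) seed(12345): biprobit_ame
    ```

    With clustered data, the resampling would presumably need to respect the clusters (the cluster() option of bootstrap); that choice depends on the sampling design and is not settled by this thread.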


  • Jeff Wooldridge
    replied
    I don't think that -mfx- command is doing what you want. It is impossible for the average marginal effect to be positive when the coefficients are negative. I'm wondering if the command you used is somehow taking into account the equation for y2. I'm suspicious because a marginal effect is being reported for z, and z does not appear in the main equation of interest. What I see from the coefficient estimates between 2SLS and biprobit is consistent. I would use the -margins- command, as it's much more recent.
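
    One way the -margins- suggestion might look, sketched with the thread's placeholder names and not run on the poster's data. The assumptions here are that y2 is entered with factor-variable notation (i.y2), so that -margins- reports a discrete change from 0 to 1, and that the prediction of interest is pmarg1, the marginal probability Pr(y1 = 1) from the first equation. Because z appears only in the second equation, it should not receive an effect under this prediction.

    ```stata
    * i.y2 marks y2 as categorical, so dydx(y2) is the 1-versus-0 difference
    biprobit (y1 = i.y2 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11) ///
             (y2 = z x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11)

    * Average effect of y2 on Pr(y1 = 1), with delta-method standard errors
    margins, dydx(y2) predict(pmarg1)
    ```

    Checking the point estimate against a by-hand average of normal(index with y2 = 1) minus normal(index with y2 = 0) is a reasonable sanity check before relying on it.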


  • Steph Ki
    replied
    The correct commands that I use:

    Code:
    * y1: binary dependent variable
    * y2: binary endogenous variable
    * z: instrumental variable
    * x: set of control variables
    
    
    *Standard linear model estimated by 2SLS
    ivreg2 y1 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 (y2 = z), first
    
    *biprobit
    biprobit (y1 y2 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11) (y2 z x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11)
    
    *I compute the average marginal effect from the biprobit using the following
    mfx compute, force
    Last edited by Steph Ki; 23 Mar 2017, 11:06.


  • Steph Ki
    replied
    Dear Statalist, thank you for your posts and recommendations. I appreciate your replies very much.

    I have estimated the two models, 2SLS and "biprobit":

    Code:
    * y1: binary dependent variable
    * y2: binary endogenous variable
    * z: instrumental variable
    * x: set of control variables
    
    *Standard linear model estimated by 2SLS
    ivreg2 y1 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 (y2 = z), r
    
    *biprobit
    biprobit (y1 y2 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11) (y2 z x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11)
    
    *To compute the average marginal effect from the biprobit I use the following command
    mfx compute, force
    However, when comparing the average marginal effects from biprobit with the 2SLS estimates, I find that the results from the two models differ. For instance, the variables x3, x4 and x9 have negative coefficients in the 2SLS estimates but positive average marginal effects from the biprobit.

    What should I do in this situation?



    [Attached images: Results -1.png, Results -2.png, Results -3.png, Results -4.png]


  • Jeff Wooldridge
    replied
    The first proposed method does not estimate anything interesting -- at least not that anyone has shown. It is an example of a "forbidden regression," where one tries to incorrectly extend 2SLS to a nonlinear model. As I tell my students: A method that plugs fitted values into nonlinear second stages should be assumed inconsistent unless you prove otherwise.

    The second method doesn't make much logical sense. If you are going to acknowledge the discreteness of the endogenous explanatory variable, it seems odd to then use a linear model for the main variable, y1. If y1 were continuous, so that a linear model could reasonably represent a conditional mean, then the method would be fine. In fact, it's a method I cover in Section 21.4 in my 2010 MIT Press book. (Incidentally, there isn't "measurement error" in the instrument. It's estimation error, or sampling error, which goes away as N gets large. That's much different than measurement error, which is a population, not a sampling, issue.)

    But with a binary y1 and binary y2, you should use two methods.

    1. A standard linear model estimated by 2SLS. This is what Angrist and Pischke propose in "Mostly Harmless Econometrics."

    2. Use the so-called "biprobit" model, where y1 and y2 are modeled as probits. This is a joint maximum likelihood procedure. You should compute the average marginal effect from the biprobit and compare it with the 2SLS estimate.

    JW
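
    For reference, the fitted-probability procedure discussed in the second paragraph above (described there as fine when y1 is continuous) can be sketched as follows. The variable names follow the thread's placeholders, phat is a hypothetical name for the fitted probability, and the robust variance option is an assumption rather than something the post specifies. The key point is that phat serves as the excluded instrument for y2; it does not replace y2 in the second stage.

    ```stata
    * First stage: probit for the binary endogenous regressor
    probit y2 z x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11
    predict double phat, pr                 // fitted probability Pr(y2 = 1 | z, x)

    * 2SLS with phat as the excluded instrument for y2
    ivregress 2sls y1 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 (y2 = phat), vce(robust)
    ```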


  • Phil Bromiley
    replied
    This is not as simple as it looks - your instrument has measurement error among other things. Why not look at ivprobit instead of programming this yourself? You should also look at cmp (user written) and GSEM which can do this kind of model.

    You'll generally get a better response if you follow the FAQ on asking questions - provide Stata data in code delimiters, Stata output, and sample data using dataex.
