  • instrumental variable for quadratic term

    In a regression model with an explanatory variable (x), we also include the squared term of x (sq_x) to capture possible nonlinearities in the relationship.
    In order to check for endogeneity of x, we use an instrumental variable (z) that is strongly correlated with x. What kind of instrumental variable could be used for the squared term (sq_x) in a two-stage least squares regression (TSLS/IV)?
    I'm aware that Wooldridge (2000) has a nice discussion of this issue.
    There are two approaches you can adopt. The first is to use the squares of the other exogenous variables as additional instruments. The second is to use (xhat)^2 as an instrument in the second-stage regression. That is, after obtaining xhat from the first-stage regression and squaring it, you estimate:
    ivreg2 y (x x2 = z xhat2) c
    where y is the dependent variable, x2 is the square of x (sq_x above), xhat2 is (xhat)^2, and c are the controls.
    Note that this approach adds a nonlinear function of your exogenous variables to your instrument set. (xhat)^2 should be used as an "instrument", not as a "regressor", in the second-stage model.
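    In equation form, the second approach can be sketched roughly as follows. The coefficient and error symbols (beta, gamma, pi, delta, u, v) are notation assumed here for illustration and are not taken from Wooldridge.

    ```latex
    \begin{align*}
    y &= \beta_0 + \beta_1 x + \beta_2 x^2 + c'\gamma + u
      && \text{(structural equation)} \\
    x &= \pi_0 + \pi_1 z + c'\delta + v
      && \text{(first stage)} \\
    \hat{x} &= \hat{\pi}_0 + \hat{\pi}_1 z + c'\hat{\delta}
      && \text{(first-stage fitted values)}
    \end{align*}
    % Instruments for the endogenous pair (x, x^2): z and \hat{x}^2.
    % The controls c act as their own instruments.
    ```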
    But I don't know how to implement this theory in Stata.
    Thank you very much in advance for suggestions!

  • #2
    You just generate the variables, e.g.,

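    Code:
    * squares of the endogenous regressor and of the instrument
    gen x2 = x^2
    gen z2 = z^2

    * first stage: fitted values of x and their square
    reg x z other_controls
    predict xhat
    gen xhat2 = xhat^2

    * approach 1: z and its square as instruments
    ivregress 2sls y (x x2 = z z2) other_controls

    * approach 2: first-stage fitted values and their square as instruments
    ivregress 2sls y (x x2 = xhat xhat2) other_controls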



    • #3
      Thank you Joro! It's really helpful!



      • #4
        Sorry to butt into an old thread. Are the two methods in #2 meant to give the same results? My instrument is a dummy, so z^2 = z and the square of z cannot serve as a separate instrument. I can therefore only run one of the following two:

        Code:
        ivregress 2sls y (x x2 = z xhat2) other_controls

        ivregress 2sls y (x x2 = xhat xhat2) other_controls

        However, these two give me somewhat different results. What's the reason for that?

        Also, when is inference valid? I use cluster-robust standard errors on my ID variable. Is it correct to cluster on ID in my first stage (where I find xhat) and in my second stage?

        Sincerely,
        Rune
