  • Heteroscedastic Probit vs. Homoscedastic PSM & IV

    Fei Men wrote me privately:

    I have had a very painstaking dilemma when researching on the effect of divorce (dichotomous variable) on mothers' food security (dichotomous). [...]

    Long story short, the heteroscedastic probit (Stata -hetprob-) model got me a small and non-significant effect of divorce with significant lnsigma2 for the divorce dummy while the homoscedastic probit, propensity score matching (PSM), and instrumental variable (IV) model have all got me a fairly large and highly significant divorce coefficient. Baseline characteristics such as income and homeownership are controlled in all models. Results are consistent across a variety of specifications in both heteroscedastic and homoscedastic models.

    Given the contrasting results across different residual variance assumptions, I was wondering which story I should put more faith in, especially when PSM and IV approaches got me significant divorce effect. Is there a way to correct for heteroscedasticity in PSM and IV models?

    I am very reluctant to trust hetprob, as its results are very sensitive to the correct specification of both the heteroscedastic part of the model and the main part of the model. We cannot observe the error term directly; instead, the heteroscedasticity manifests itself by making linear effects (slightly) non-linear. This is what is used to identify the heteroscedasticity in hetprob. However, if the effect wasn't linear to begin with, then hetprob will incorrectly attribute that deviation from linearity to heteroscedasticity and "adjust" all effects with that incorrect estimate of the residual variance. Similarly, an incorrect specification of the heteroscedastic part will result in noticeably biased estimates. Below is a simulation that illustrates this point:

    Code:
    . clear all
    
    . set seed 123456
    
    .
    . program define sim, rclass
      1.     drop _all
      2.     set obs 1000
      3.     gen x = rnormal()
      4.
    .        // hetprobit is correctly specified
    .        gen ystar1 = 1 + x + rnormal(0,exp(.5*x))
      5.     gen byte y1 = ystar1 > 0
      6.     hetprob y1 x, het(x)
      7.     return scalar b1 = _b[x]
      8.
    .        // heteroscedasticity is incorrectly specified
    .        gen ystar2 = 1 + x + rnormal(0, exp(.5*x + .25*x^2))
      9.     gen byte y2 = ystar2 > 0
     10.     hetprob y2 x, het(x)
     11.     return scalar b2 = _b[x]
     12.
    .        // no heteroscedasticity, but incorrectly specified x
    .        gen ystar3 = 1 + x + .5*x^2 + rnormal()
     13.     gen byte y3 = ystar3 > 0
     14.     hetprobit y3 x, het(x)
     15.     return scalar b3 = _b[x]
     16.
    .        probit y3 c.x##c.x
     17.     return scalar b4 = _b[x]
     18. end
    
    . simulate b1=r(b1) b2=r(b2) b3=r(b3) b4=r(b4), reps(2000) nodots : sim
    
          command:  sim
               b1:  r(b1)
               b2:  r(b2)
               b3:  r(b3)
               b4:  r(b4)
    
    
    . sum
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
              b1 |      2,000    1.005738    .0590741   .8329385   1.270037
              b2 |      2,000    .6337518    .0690086   .4348118   .8676383
              b3 |      2,000    .0000536    .0019632  -3.15e-06   .0851099
              b4 |      2,000    1.020313    .1348678   .6841113   1.535614
    
    . simsum b*, true(1) mcse bias dropbig
    
    Warning: found 2 observations with standardised b3 > 10
          +----------+
          |       b3 |
          |----------|
      41. | .0215943 |
    1086. | .0851099 |
          +----------+
    --> b3 have been changed to missing values for these observations
    
    Starting to process results ...
    
      +------------------------------------------------------------------------------------------------------------------+
      |    Performance measure      r(b1)     (MCse)       r(b2)     (MCse)       r(b3)     (MCse)      r(b4)     (MCse) |
      |------------------------------------------------------------------------------------------------------------------|
      | Bias in point estimate   .0057378   .0013209   -.3662483   .0015431   -.9999997   4.35e-08   .0203125   .0030157 |
      +------------------------------------------------------------------------------------------------------------------+
    So you need some very extensive model checking before you can believe the results from hetprobit.
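
    One possible check, offered as a minimal sketch rather than a complete diagnostic: hetprobit models Pr(y = 1) = Phi(xb / exp(zg)), so a non-zero g and a curved effect of x can fit the data in similar ways. The code below reuses the third data-generating process from the simulation (no heteroscedasticity, a squared term in x) on a single data set and fits a linear probit, a probit with a squared term, and a heteroscedastic probit. The names lin, quad, and het are only illustrative labels for the stored estimates. Both alternatives nest the linear probit, so each can be tested against it with a likelihood-ratio test. They do not nest each other, so this sketch assumes that information criteria are an acceptable informal way to compare them.

    Code:
    clear all
    set seed 123456
    set obs 1000
    gen x = rnormal()

    // same data-generating process as y3 above:
    // homoscedastic errors, but a squared term in x
    gen byte y3 = (1 + x + .5*x^2 + rnormal()) > 0

    // homoscedastic probit, linear in x
    probit y3 x
    estimates store lin

    // homoscedastic probit with a squared term
    probit y3 c.x##c.x
    estimates store quad

    // heteroscedastic probit, linear in x
    hetprobit y3 x, het(x)
    estimates store het

    // the linear probit is nested in both alternatives
    lrtest lin quad
    lrtest lin het

    // quad and het are not nested: compare AIC and BIC instead
    estimates stats lin quad het

    If both tests reject the linear probit, a significant heteroscedasticity equation on its own does not tell you whether the variance really differs or whether the linear specification is just too restrictive.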
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------