Heteroscedastic Probit vs. Homoscedastic PSM & IV

Maarten Buis

30 May 2016, 03:01

Fei Men wrote me privately:

I have had a very painstaking dilemma when researching on the effect of divorce (dichotomous variable) on mothers' food security (dichotomous). [...]

Long story short, the heteroscedastic probit (Stata -hetprob-) model got me a small and non-significant effect of divorce with significant lnsigma2 for the divorce dummy while the homoscedastic probit, propensity score matching (PSM), and instrumental variable (IV) model have all got me a fairly large and highly significant divorce coefficient. Baseline characteristics such as income and homeownership are controlled in all models. Results are consistent across a variety of specifications in both heteroscedastic and homoscedastic models.

Given the contrasting results across different residual variance assumptions, I was wondering which story I should put more faith in, especially when PSM and IV approaches got me significant divorce effect. Is there a way to correct for heteroscedasticity in PSM and IV models?

I am very reluctant to trust hetprobit, as its results are very sensitive to the correct specification of both the heteroscedastic part of the model and the main part of the model. There is no way in which we can directly observe the error term; instead, heteroscedasticity manifests itself by making linear effects (slightly) non-linear. This is what hetprobit uses to identify the heteroscedasticity. However, if the effect wasn't linear to begin with, then hetprobit will incorrectly attribute that deviation from linearity to heteroscedasticity and "adjust" all effects with that incorrect estimate of the residual variance. Similarly, an incorrect specification of the heteroscedastic part will result in noticeably biased results. Below is a simulation that illustrates this point:

Code:

. clear all

. set seed 123456

. program define sim, rclass
      drop _all
      set obs 1000
      gen x = rnormal()

      // hetprobit is correctly specified
      gen ystar1 = 1 + x + rnormal(0, exp(.5*x))
      gen byte y1 = ystar1 > 0
      hetprob y1 x, het(x)
      return scalar b1 = _b[x]

      // heteroscedasticity is incorrectly specified
      gen ystar2 = 1 + x + rnormal(0, exp(.5*x + .25*x^2))
      gen byte y2 = ystar2 > 0
      hetprob y2 x, het(x)
      return scalar b2 = _b[x]

      // no heteroscedasticity, but incorrectly specified x
      gen ystar3 = 1 + x + .5*x^2 + rnormal()
      gen byte y3 = ystar3 > 0
      hetprobit y3 x, het(x)
      return scalar b3 = _b[x]

      // homoscedastic probit with the correct (quadratic) specification
      probit y3 c.x##c.x
      return scalar b4 = _b[x]
  end

. simulate b1=r(b1) b2=r(b2) b3=r(b3) b4=r(b4), reps(2000) nodots : sim

      command:  sim
           b1:  r(b1)
           b2:  r(b2)
           b3:  r(b3)
           b4:  r(b4)


. sum

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
          b1 |      2,000    1.005738    .0590741   .8329385   1.270037
          b2 |      2,000    .6337518    .0690086   .4348118   .8676383
          b3 |      2,000    .0000536    .0019632  -3.15e-06   .0851099
          b4 |      2,000    1.020313    .1348678   .6841113   1.535614

. simsum b*, true(1) mcse bias dropbig

Warning: found 2 observations with standardised b3 > 10
      +----------+
      |       b3 |
      |----------|
  41. | .0215943 |
1086. | .0851099 |
      +----------+
--> b3 have been changed to missing values for these observations

Starting to process results ...

  +------------------------------------------------------------------------------------------------------------------+
  |    Performance measure      r(b1)     (MCse)       r(b2)     (MCse)       r(b3)     (MCse)      r(b4)     (MCse) |
  |------------------------------------------------------------------------------------------------------------------|
  | Bias in point estimate   .0057378   .0013209   -.3662483   .0015431   -.9999997   4.35e-08   .0203125   .0030157 |
  +------------------------------------------------------------------------------------------------------------------+
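
The confounding of curvature and heteroscedasticity in the third scenario can be sketched on paper. The notation is an assumption of this sketch: a single covariate x that also drives the variance, index coefficients \beta_0 and \beta_1, and a variance parameter \gamma. The expansion is first order in \gamma and is meant as intuition only, not as a derivation of the size of the bias.

```latex
\Pr(y = 1 \mid x) = \Phi\!\left(\frac{\beta_0 + \beta_1 x}{\exp(\gamma x)}\right),
\qquad
(\beta_0 + \beta_1 x)\,e^{-\gamma x}
  \approx \beta_0 + (\beta_1 - \gamma\beta_0)\,x - \gamma\beta_1\,x^2 .
```

To this order, a non-zero \gamma looks much like a quadratic term in the index. When the true index contains x^2 and the errors are homoscedastic, hetprobit can only pick up the curvature through \gamma, and \gamma rescales the coefficient of x as well.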

So you need some very extensive model checking before you can believe the results from hetprobit.
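
One simple check of this kind, as a minimal sketch: fit a homoscedastic probit with a squared term alongside hetprobit and compare the fits. The data-generating step just repeats the third scenario of the simulation for a single sample. The stored-estimate names (linear, quad, het) are arbitrary labels chosen here, and comparing information criteria is only one of many possible checks, not a test that settles the question.

```stata
clear
set seed 123456
set obs 1000
gen x = rnormal()

// homoscedastic errors, but the index is quadratic in x
gen byte y = (1 + x + .5*x^2 + rnormal()) > 0

// probit that wrongly assumes a linear index
probit y x
estimates store linear

// probit with the squared term
probit y c.x##c.x
estimates store quad

// heteroscedastic probit with a linear index
hetprobit y x, het(x)
estimates store het

// AIC and BIC for the three models side by side
estimates stats linear quad het

// likelihood-ratio test of the squared term (linear is nested in quad)
lrtest linear quad
```

If the probit with the squared term fits about as well as hetprobit, the data cannot tell curvature from heteroscedasticity, and the hetprobit coefficients should be read with corresponding caution.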

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
