Hi Statalist,
To tackle endogeneity arising from omitted variables and reverse causality, I am using two-stage least squares (2SLS) regression for my analysis. With the help of previous advice given here, this is my Stata code so far:
Dependent variable (binary): emotional
Endogenous variable (binary): currwork
Instrumental variable (binary): cheb
I am slightly confused, as I have read comments on Statalist suggesting that the first stage should be estimated by linear regression rather than logistic/probit regression, because “in 2SLS the consistency of the estimates in the second stage are not dependent upon specifying the correct functional form in the first stage”.
I estimate the models using these commands, which give identical results:
Code:
probit currwork i.husjob attitude i.prevdv i.educgap i.educlvl agegap age agefrstmar i.religion i.urban i.geo_eg1988_2014 nsons i.hhkidlt5 i.wealthq i.year i.cheb
predict workhat, xb
ivregress 2sls emotional i.husjob attitude i.prevdv i.educgap i.educlvl agegap age agefrstmar i.religion i.urban i.geo_eg1988_2014 nsons i.hhkidlt5 i.wealthq i.year (i.currwork=i.cheb workhat)
Code:
ivreg2 emotional (i.currwork=i.cheb) i.husjob attitude i.prevdv i.educgap i.educlvl agegap age agefrstmar i.religion i.urban i.geo_eg1988_2014 nsons i.hhkidlt5 i.wealthq i.year, first
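For comparison, the linear first stage described in the Statalist comments quoted above could be sketched by hand as follows. This is only an illustration under my own assumptions: the local macro controls and the variable currwork_hat are made-up names, and the two steps are assumed to run on the same estimation sample.
Code:
* Sketch only: manual two-step version with a linear (OLS) first stage.
* `controls' and currwork_hat are illustrative names, not part of the commands above.
local controls i.husjob attitude i.prevdv i.educgap i.educlvl agegap age agefrstmar i.religion i.urban i.geo_eg1988_2014 nsons i.hhkidlt5 i.wealthq i.year
* First stage: linear probability model for the endogenous regressor
regress currwork i.cheb `controls'
predict currwork_hat, xb
* Second stage: on the same sample, the coefficient on currwork_hat equals the 2SLS point estimate,
* but the standard errors reported here are not valid, which is why ivregress/ivreg2 do both steps at once.
regress emotional currwork_hat `controls'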
My question is: am I estimating these models correctly, or is this the “forbidden regression”?
Regression output:
Code:
First-stage regressions
-----------------------

First-stage regression of 1.currwork:

Statistics consistent for homoskedasticity only
Number of obs =                   9691
----------------------------------------------------------------------------------------
            1.currwork |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
                1.cheb |   .0623816   .0188212     3.31   0.001     .0254881     .099275
                       |
                husjob |
     White collar job  |   .0794522    .026904     2.95   0.003     .0267147    .1321898
      Blue collar job  |   .0482838   .0264973     1.82   0.068    -.0036565    .1002242
                       |
              attitude |  -.0011089   .0025856    -0.43   0.668    -.0061773    .0039595
              1.prevdv |    .000505   .0101112     0.05   0.960     -.019315     .020325
                       |
               educgap |
 Wife better educated  |  -.0717708   .0193309    -3.71   0.000    -.1096633   -.0338782
Both equally educated  |  -.0217123   .0160946    -1.35   0.177    -.0532611    .0098365
                       |
               educlvl |
              primary  |  -.0486817   .0132463    -3.68   0.000    -.0746473   -.0227161
            secondary  |   .0835031    .018246     4.58   0.000     .0477372     .119269
               higher  |   .3439955   .0241975    14.22   0.000     .2965633    .3914276
                       |
                agegap |  -.0017554   .0008271    -2.12   0.034    -.0033767   -.0001341
                   age |   .0081049   .0007404    10.95   0.000     .0066536    .0095562
            agefrstmar |   .0056594   .0011651     4.86   0.000     .0033755    .0079433
                       |
              religion |
            christian  |  -.0074209     .01782    -0.42   0.677    -.0423518    .0275101
                       |
                 urban |
                rural  |   .0310775   .0113583     2.74   0.006      .008813    .0533421
                       |
       geo_eg1988_2014 |
          lower egypt  |   .0509753   .0125685     4.06   0.000     .0263384    .0756121
          upper egypt  |   .0183381   .0125682     1.46   0.145    -.0062982    .0429743
frontier governorates  |   .0390965   .0182698     2.14   0.032     .0032839    .0749091
                       |
                 nsons |  -.0105289   .0040027    -2.63   0.009    -.0183749   -.0026828
                       |
              hhkidlt5 |
                    1  |  -.0249247   .0110358    -2.26   0.024    -.0465571   -.0032923
                    2  |   -.039848   .0131426    -3.03   0.002    -.0656102   -.0140859
                   3+  |  -.0166016   .0198564    -0.84   0.403    -.0555242    .0223211
                       |
               wealthq |
               poorer  |  -.0595622    .012753    -4.67   0.000    -.0845607   -.0345637
               middle  |  -.0659929   .0132747    -4.97   0.000     -.092014   -.0399717
               richer  |  -.0514859   .0147724    -3.49   0.000     -.080443   -.0225289
              richest  |  -.0389741   .0173162    -2.25   0.024    -.0729173   -.0050308
                       |
                  year |
                 2014  |  -.1019067    .007726   -13.19   0.000    -.1170512   -.0867622
                       |
                 _cons |  -.2760092   .0455071    -6.07   0.000    -.3652125   -.1868058
----------------------------------------------------------------------------------------
F test of excluded instruments:
  F(  1,  9663) =    10.99
  Prob > F      =   0.0009
Sanderson-Windmeijer multivariate F test of excluded instruments:
  F(  1,  9663) =    10.99
  Prob > F      =   0.0009

Summary results for first-stage regressions
-------------------------------------------

                                           (Underid)            (Weak id)
Variable     | F(  1,  9663)  P-val | SW Chi-sq(  1) P-val | SW F(  1,  9663)
1.currwork   |      10.99    0.0009 |       11.02   0.0009 |       10.99

Stock-Yogo weak ID F test critical values for single endogenous regressor:
                                   10% maximal IV size             16.38
                                   15% maximal IV size              8.96
                                   20% maximal IV size              6.66
                                   25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
NB: Critical values are for Sanderson-Windmeijer F statistic.

Underidentification test
Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
Ha: matrix has rank=K1 (identified)
Anderson canon. corr. LM statistic       Chi-sq(1)=11.00    P-val=0.0009

Weak identification test
Ho: equation is weakly identified
Cragg-Donald Wald F statistic                                      10.99

Stock-Yogo weak ID test critical values for K1=1 and L1=1:
                                   10% maximal IV size             16.38
                                   15% maximal IV size              8.96
                                   20% maximal IV size              6.66
                                   25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.

Weak-instrument-robust inference
Tests of joint significance of endogenous regressors B1 in main equation
Ho: B1=0 and orthogonality conditions are valid
Anderson-Rubin Wald test           F(1,9663)=      12.88     P-val=0.0003
Anderson-Rubin Wald test           Chi-sq(1)=      12.92     P-val=0.0003
Stock-Wright LM S statistic        Chi-sq(1)=      12.90     P-val=0.0003

Number of observations               N  =       9691
Number of regressors                 K  =         28
Number of endogenous regressors      K1 =          1
Number of instruments                L  =         28
Number of excluded instruments       L1 =          1

IV (2SLS) estimation
--------------------

Estimates efficient for homoskedasticity only
Statistics consistent for homoskedasticity only

                                                      Number of obs =     9691
                                                      F( 27,  9663) =     7.87
                                                      Prob > F      =   0.0000
Total (centered) SS     =  1341.189145                Centered R2   =  -0.9841
Total (uncentered) SS   =         1608                Uncentered R2 =  -0.6549
Residual SS             =  2661.085206                Root MSE      =     .524

----------------------------------------------------------------------------------------
             emotional |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------+----------------------------------------------------------------
              currwork |
                  yes  |   1.082058    .433683     2.50   0.013     .2320545    1.932061
                       |
                husjob |
     White collar job  |  -.0319561   .0526654    -0.61   0.544    -.1351783    .0712662
      Blue collar job  |   .0248605    .044106     0.56   0.573    -.0615856    .1113067
                       |
              attitude |   .0163529   .0037506     4.36   0.000     .0090018     .023704
              1.prevdv |   .1211017   .0145366     8.33   0.000     .0926104     .149593
                       |
               educgap |
 Wife better educated  |    .149971   .0418604     3.58   0.000     .0679261    .2320159
Both equally educated  |   .0727311   .0249662     2.91   0.004     .0237984    .1216639
                       |
               educlvl |
              primary  |   .0611606   .0281502     2.17   0.030     .0059872    .1163339
            secondary  |  -.1788748   .0454536    -3.94   0.000    -.2679623   -.0897873
               higher  |   -.504343   .1539387    -3.28   0.001    -.8060573   -.2026286
                       |
                agegap |   .0007305    .001412     0.52   0.605     -.002037     .003498
                   age |  -.0077304    .004027    -1.92   0.055    -.0156232    .0001624
            agefrstmar |  -.0084541   .0026322    -3.21   0.001    -.0136132   -.0032951
                       |
              religion |
            christian  |  -.0278122   .0257252    -1.08   0.280    -.0782327    .0226083
                       |
                 urban |
                rural  |  -.0739228   .0208361    -3.55   0.000    -.1147608   -.0330847
                       |
       geo_eg1988_2014 |
          lower egypt  |  -.0476695   .0286722    -1.66   0.096    -.1038659    .0085269
          upper egypt  |  -.0334369   .0193555    -1.73   0.084     -.071373    .0044993
frontier governorates  |  -.0753515   .0308496    -2.44   0.015    -.1358156   -.0148873
                       |
                 nsons |   .0068156   .0068349     1.00   0.319    -.0065805    .0202117
                       |
              hhkidlt5 |
                    1  |   .0111388   .0145799     0.76   0.445    -.0174373    .0397149
                    2  |   .0507056   .0192942     2.63   0.009     .0128897    .0885215
                   3+  |   .0227837   .0273771     0.83   0.405    -.0308745    .0764419
                       |
               wealthq |
               poorer  |   .0596156   .0315742     1.89   0.059    -.0022687       .1215
               middle  |    .073244   .0343015     2.14   0.033     .0060144    .1404736
               richer  |   .0450257   .0307738     1.46   0.143    -.0152898    .1053412
              richest  |   .0089513   .0301276     0.30   0.766    -.0500976    .0680003
                       |
                  year |
                 2014  |   .1408985   .0452166     3.12   0.002     .0522756    .2295214
                       |
                 _cons |   .3966987   .1248004     3.18   0.001     .1520943     .641303
----------------------------------------------------------------------------------------
Underidentification test (Anderson canon. corr. LM statistic):          11.005
                                                   Chi-sq(1) P-val =    0.0009
------------------------------------------------------------------------------
Weak identification test (Cragg-Donald Wald F statistic):               10.985
Stock-Yogo weak ID test critical values: 10% maximal IV size             16.38
                                         15% maximal IV size              8.96
                                         20% maximal IV size              6.66
                                         25% maximal IV size              5.53
Source: Stock-Yogo (2005).  Reproduced by permission.
------------------------------------------------------------------------------
Sargan statistic (overidentification test of all instruments):           0.000
                                                 (equation exactly identified)
------------------------------------------------------------------------------
Instrumented:         1.currwork
Included instruments: 1.husjob 2.husjob attitude 1.prevdv 2.educgap 3.educgap
                      1.educlvl 2.educlvl 3.educlvl agegap age agefrstmar
                      1.religion 2.urban 2.geo_eg1988_2014 3.geo_eg1988_2014
                      4.geo_eg1988_2014 nsons 1.hhkidlt5 2.hhkidlt5 3.hhkidlt5
                      2.wealthq 3.wealthq 4.wealthq 5.wealthq 2014.year
Excluded instruments: 1.cheb
------------------------------------------------------------------------------