I have a panel of mothers observed at several time points. In the regressions below I use years five and ten to determine whether increases in obesity are due to increases in unemployment.
In my analysis I use a random-effects linear probability model (LPM) to look at the relationship between changes in local area unemployment (psum_unemployed_total_cont_y) and obesity (binbmi_obese).
I also estimate a logit model (with margins) to confirm that the linear probability model is acceptable.
I have noticed that, although the coefficients are often similar in magnitude between the two models, the logit model is often less significant than the LPM, i.e. its z-stat is usually larger than 1.65 but just below 1.96.
This is a characteristic across almost all of my models, but I include an example below to illustrate my point.
My regression models are as below:
1. The logit model:
Code:
. * Obese:
.
. * Logit:
.
. xtlogit binbmi_obese_y psum_unemployed_total_cont_y i.own_education_y i.maritalstatus_y i.medical_card_y i.employment_y i.ord_age_y if gender==0, re nolog
note: 2.own_education_y != 0 predicts failure perfectly
      2.own_education_y dropped and 1 obs not used

note: 5.employment_y != 0 predicts failure perfectly
      5.employment_y dropped and 3 obs not used

note: 6.own_education_y omitted because of collinearity

Random-effects logistic regression              Number of obs     =        644
Group variable: id                              Number of groups  =        468

Random effects u_i ~ Gaussian                   Obs per group:
                                                              min =          1
                                                              avg =        1.4
                                                              max =          2

Integration method: mvaghermite                 Integration pts.  =         12

                                                Wald chi2(18)     =      38.79
Log likelihood  = -244.39807                    Prob > chi2       =     0.0030

-----------------------------------------------------------------------------------------------------------------------------
                                             binbmi_obese_y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
------------------------------------------------------------+----------------------------------------------------------------
                               psum_unemployed_total_cont_y |   .2192371   .0629466     3.48   0.000     .0958641    .3426101
                                                            |
                                            own_education_y |
                                   Primary school education |          0  (empty)
                                      Some secondary school |   2.663921   1.342674     1.98   0.047     .0323287    5.295514
                               Complete secondary education |   1.275665   .9904268     1.29   0.198    -.6655358    3.216866
     Some third level education at college, university, RTC |   2.306741   1.111069     2.08   0.038     .1290862    4.484396
 Complete third level education at college, university, RTC |          0  (omitted)
                                                            |
                                            maritalstatus_y |
                                                 Cohabiting |  -2.320489   1.860596    -1.25   0.212    -5.967191    1.326213
                                                  Separated |   .6197481   3.457853     0.18   0.858    -6.157519    7.397015
                                                   Divorced |   2.169449   4.667265     0.46   0.642    -6.978222    11.31712
                                                    Widowed |  -.1380215   5.929293    -0.02   0.981    -11.75922    11.48318
                                       Single/Never married |  -3.850925   2.827318    -1.36   0.173    -9.392367    1.690516
                                                            |
                                             medical_card_y |
                                                        Yes |  -.0629122   .9992572    -0.06   0.950     -2.02142    1.895596
                                                            |
                                               employment_y |
                                                 Unemployed |   .7574163   2.422145     0.31   0.755      -3.9899    5.504732
   Unable to work owing to permanent sickness or disability |     15.016   3.609077     4.16   0.000     7.942335    22.08966
                                          At school/student |  -8.785242   3.742243    -2.35   0.019     -16.1199    -1.45058
                            Seeking work for the first time |          0  (empty)
                                                   Employed |  -.3359071   .8523104    -0.39   0.693    -2.006405    1.334591
                                              Self Employed |   .4553743   1.412773     0.32   0.747    -2.313609    3.224358
                                                            |
                                                  ord_age_y |
                                                      24-27 |  -1.491118   3.713545    -0.40   0.688    -8.769533    5.787297
                                                      28-32 |  -4.069408   3.860242    -1.05   0.292    -11.63534    3.496527
                                                       33 + |  -5.115289   3.929976    -1.30   0.193     -12.8179    2.587322
                                                            |
                                                      _cons |  -8.224994   3.972662    -2.07   0.038    -16.01127   -.4387195
------------------------------------------------------------+----------------------------------------------------------------
                                                   /lnsig2u |   4.624441    .152662                      4.325229    4.923653
------------------------------------------------------------+----------------------------------------------------------------
                                                    sigma_u |   10.09682   .7707005                      8.693839    11.72621
                                                        rho |   .9687381   .0046233                      .9582889    .9766335
-----------------------------------------------------------------------------------------------------------------------------
LR test of rho=0: chibar2(01) = 50.28                  Prob >= chibar2 = 0.000

. margins if gender==0, dydx(psum_unemployed_total_cont_y) post

Average marginal effects                        Number of obs     =        644
Model VCE    : OIM

Expression   : Pr(binbmi_obese_y=1), predict(pr)
dy/dx w.r.t. : psum_unemployed_total_cont_y

----------------------------------------------------------------------------------------------
                             |            Delta-method
                             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------------+----------------------------------------------------------------
psum_unemployed_total_cont_y |    .002648   .0013722     1.93   0.054    -.0000414    .0053374
----------------------------------------------------------------------------------------------
2. The LPM Model:
Code:
. * LPM:
.
. xtreg binbmi_obese_y psum_unemployed_total_cont_y i.own_education_y i.maritalstatus_y i.medical_card_y i.employment_y i.ord_age_y if gender==0, cluster (current_county_y1) re robust

Random-effects GLS regression                   Number of obs     =        648
Group variable: id                              Number of groups  =        470

R-sq:                                           Obs per group:
     within  = 0.1308                                         min =          1
     between = 0.0278                                         avg =        1.4
     overall = 0.0408                                         max =          2

                                                Wald chi2(20)     =    3161.46
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000

                                                    (Std. Err. adjusted for 25 clusters in current_county_y1)
-----------------------------------------------------------------------------------------------------------------------------
                                                            |               Robust
                                             binbmi_obese_y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
------------------------------------------------------------+----------------------------------------------------------------
                               psum_unemployed_total_cont_y |   .0056285   .0016592     3.39   0.001     .0023764    .0088805
                                                            |
                                            own_education_y |
                                      Some secondary school |   .2865854   .0458478     6.25   0.000     .1967254    .3764454
                               Complete secondary education |   .2297725   .0361645     6.35   0.000     .1588915    .3006536
     Some third level education at college, university, RTC |   .2824864   .0569089     4.96   0.000      .170947    .3940258
 Complete third level education at college, university, RTC |   .1690297   .0233981     7.22   0.000     .1231702    .2148892
                                                            |
                                            maritalstatus_y |
                                                 Cohabiting |  -.0798641   .0448935    -1.78   0.075    -.1678538    .0081256
                                                  Separated |   .0263302   .1272542     0.21   0.836    -.2230834    .2757438
                                                   Divorced |   .0641585   .1679497     0.38   0.702    -.2650169    .3933338
                                                    Widowed |   .0472516   .1193659     0.40   0.692    -.1867013    .2812046
                                       Single/Never married |  -.1314298   .0588051    -2.24   0.025    -.2466857    -.016174
                                                            |
                                             medical_card_y |
                                                        Yes |   .0011589   .0358399     0.03   0.974     -.069086    .0714037
                                                            |
                                               employment_y |
                                                 Unemployed |    .044326   .0615108     0.72   0.471    -.0762329    .1648849
   Unable to work owing to permanent sickness or disability |   .5130424   .1505187     3.41   0.001     .2180311    .8080537
                                          At school/student |  -.1526972   .0778524    -1.96   0.050     -.305285   -.0001094
                            Seeking work for the first time |  -.0947593   .0895379    -1.06   0.290    -.2702503    .0807317
                                                   Employed |  -.0113248   .0164682    -0.69   0.492    -.0436018    .0209523
                                              Self Employed |   .0148456   .0510321     0.29   0.771    -.0851755    .1148667
                                                            |
                                                  ord_age_y |
                                                      24-27 |  -.0575039   .1597636    -0.36   0.719    -.3706348    .2556271
                                                      28-32 |  -.1455253    .153839    -0.95   0.344    -.4470441    .1559935
                                                       33 + |  -.1774199    .147281    -1.20   0.228    -.4660853    .1112455
                                                            |
                                                      _cons |   .0686216   .1396422     0.49   0.623     -.205072    .3423153
------------------------------------------------------------+----------------------------------------------------------------
                                                    sigma_u |  .30869174
                                                    sigma_e |  .21519403
                                                        rho |  .67296061   (fraction of variance due to u_i)
-----------------------------------------------------------------------------------------------------------------------------
I assume that the differences arise because the LPM is clustered at the local area in which the individual lives (the level at which I measure unemployment) and uses the robust option.
Thus, in justifying the differences observed for the psum_unemployed coefficient between the two models, my instinct is to attribute them to the fact that the clustered standard errors applied in the LPM could not be applied to the RE logit estimator, as that estimator is inconsistent in the presence of serial correlation (and heteroskedasticity) (this is according to the PhD thesis of Do Wan Kwak, quoted by @Jeff Wooldridge here: https://www.statalist.org/forums/for...tandard-errors).
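One way to check whether the gap in significance comes from the variance estimator rather than the functional form is to put both models on the same footing. The following is only a minimal sketch, assuming the variable names shown in the output above, that each mother stays within one county cluster, and a Stata release in which xtlogit, re accepts vce(cluster); the local macro rhs is introduced here purely for illustration.

```stata
* Sketch only: names are taken from the posted output; `rhs' is a hypothetical helper macro
local rhs psum_unemployed_total_cont_y i.own_education_y i.maritalstatus_y ///
    i.medical_card_y i.employment_y i.ord_age_y

* (a) LPM with conventional (non-clustered) RE standard errors,
*     comparable to the OIM standard errors behind the logit margins
xtreg binbmi_obese_y `rhs' if gender==0, re

* (b) Pooled logit with the same county clustering as the LPM
logit binbmi_obese_y `rhs' if gender==0, vce(cluster current_county_y1) nolog
margins, dydx(psum_unemployed_total_cont_y)

* (c) RE logit with county clustering (assumes this vce() is available for xtlogit, re
*     in the installed version and that panels are nested within counties)
xtlogit binbmi_obese_y `rhs' if gender==0, re vce(cluster current_county_y1) nolog
margins, dydx(psum_unemployed_total_cont_y)
```

If the z-statistic in (a) falls towards the logit value, or those in (b) and (c) rise towards the LPM value, the difference is mostly a matter of how the standard errors are computed.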
I also see that there is a slight problem of perfect prediction here, and I wonder whether this could be adding to these differences. As the logit model is mainly there to support my use of the LPM, do you think this is something I should worry about, and is there anything I can do to handle it?
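To see how sparse the affected cells are, the two factors flagged in the xtlogit notes can be cross-tabulated against the outcome. This is a minimal sketch, assuming the variable names in the output above:

```stata
* Cross-tabulate the factors named in the perfect-prediction notes against the outcome
tabulate own_education_y binbmi_obese_y if gender==0
tabulate employment_y binbmi_obese_y if gender==0
```

The notes report 1 and 3 observations dropped, which matches the gap between the 648 observations in the LPM and the 644 in the logit.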
Grateful for any thoughts.
Very best,
John