I am running a fixed-effects logistic regression of self-rated health on local employment change over three waves, as follows (standard errors are clustered on baseline geographical location):
Code:
. clogit binary_health_y psum_unemployed_total_cont_y i.yrlycurrent_county_y1 i.year age_y i.maritalstatus_
> y if has_y0_questionnaire==1 & has_y5_questionnaire==1 | has_y0_questionnaire==1 & has_y10_questionnaire=
> =1 | has_y0_questionnaire==1 & has_y5_questionnaire==1 & has_y10_questionnaire==1 | has_y0_questionnaire=
> =1 & cbmi_y5 !=. & has_y5_questionnaire==0 | has_y0_questionnaire==1 & cbmi_y10 !=. & has_y10_questionnai
> re==0 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_questionnaire==0 & cbmi_y10 !=. & has_y10_question
> naire==0 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_questionnaire==1 | has_y0_questionnaire==1 & cb
> mi_y10 !=. & has_y10_questionnaire==1 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_questionnaire==1 &
> cbmi_y10 !=. & has_y10_questionnaire==1, group(id) cluster (current_county_y1) robust iterate(500) nolog
note: multiple positive outcomes within groups encountered.
note: 447 groups (1,057 obs) dropped because of all positive or
      all negative outcomes.
note: 3.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 6.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 7.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 13.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 15.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 16.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 17.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 18.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 19.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 20.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 23.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 24.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 25.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 26.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 28.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 29.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 32.yrlycurrent_county_y1 omitted because of no within-group variance.
convergence not achieved

Conditional (fixed-effects) logistic regression

                                                Number of obs     =        524
                                                Wald chi2(16)     =          .
                                                Prob > chi2       =          .
Log pseudolikelihood = -178.37962               Pseudo R2         =     0.0592

                             (Std. Err. adjusted for 20 clusters in current_county_y1)
----------------------------------------------------------------------------------------------
                             |               Robust
             binary_health_y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------------+----------------------------------------------------------------
psum_unemployed_total_cont_y |  -.0720918   .0388134    -1.86   0.063    -.1481648    .0039811
                             |
       yrlycurrent_county_y1 |
                      Cavan  |          0  (empty)
                      Clare  |   3.508484   1.437552     2.44   0.015     .6909335    6.326035
                       Cork  |  -701.8535          .        .       .            .           .
                    Donegal  |          0  (empty)
                  Dublin 16  |          0  (empty)
                Dublin City  |   5.209175   1.927489     2.70   0.007     1.431367    8.986984
     Dún Laoghaire-Rathdown  |   5.798566   1.803271     3.22   0.001     2.264218    9.332913
                     Fingal  |    5.31021   2.008511     2.64   0.008     1.373601    9.246819
                     Galway  |   .7730753   1.945584     0.40   0.691    -3.040199     4.58635
                Galway City  |   .7289299          .        .       .            .           .
                      Kerry  |          0  (empty)
                    Kildare  |   .5227628   1.379929     0.38   0.705    -2.181848    3.227374
                   Kilkenny  |          0  (omitted)
                      Laois  |          0  (omitted)
                    Leitrim  |          0  (empty)
                   Limerick  |          0  (empty)
                   Longford  |          0  (empty)
                      Louth  |          0  (empty)
                       Mayo  |  -37.39307          .        .       .            .           .
                      Meath  |   5.840481   1.762588     3.31   0.001     2.385872    9.295089
                   Monaghan  |          0  (empty)
                     Offaly  |          0  (omitted)
                  Roscommon  |          0  (omitted)
                      Sligo  |          0  (empty)
               South Dublin  |   5.426697   1.908334     2.84   0.004     1.686432    9.166962
                  Tipperary  |          0  (empty)
            Tipperary North  |          0  (empty)
                  Waterford  |  -559.1875          .        .       .            .           .
                  Westmeath  |   3.329881   1.503996     2.21   0.027      .382103    6.277658
                    Wexford  |          0  (omitted)
                    Wicklow  |  -97.28333          .        .       .            .           .
                             |
                        year |
                          5  |  -.4421024   .2411563    -1.83   0.067    -.9147601    .0305553
                         10  |   .2906156          .        .       .            .           .
                             |
                       age_y |   .0180895    .030657     0.59   0.555    -.0419972    .0781762
                             |
             maritalstatus_y |
                 Cohabiting  |   .2093029   .1423515     1.47   0.141     -.069701    .4883068
                  Separated  |  -.5001542   1.496891    -0.33   0.738    -3.434007    2.433699
                   Divorced  |  -1.252438   .6516711    -1.92   0.055    -2.529689    .0248143
                    Widowed  |   .5818359   1.804183     0.32   0.747    -2.954298    4.117969
       Single/Never married  |   -.025371   .4241849    -0.06   0.952    -.8567581    .8060161
----------------------------------------------------------------------------------------------
Warning: convergence not achieved

. margins, dydx(psum_unemployed_total_cont_y) post

Average marginal effects                        Number of obs     =        524
Model VCE    : Robust

Expression   : Pr(binary_health_y|fixed effect is 0), predict(pu0)
dy/dx w.r.t. : psum_unemployed_total_cont_y

----------------------------------------------------------------------------------------------
                             |            Delta-method
                             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------------+----------------------------------------------------------------
psum_unemployed_total_cont_y |  -.0083921   .0045424    -1.85   0.065    -.0172951    .0005108
----------------------------------------------------------------------------------------------
However, convergence is not achieved. After reviewing the Statalist archives, I removed variables one by one until I identified the two culprits: respondent's age and geographical location. Neither is included in the model below:
Code:
. clogit binary_health_y psum_unemployed_total_cont_y i.year i.maritalstatus_y if has_y0_questionnaire==1 &
> has_y5_questionnaire==1 | has_y0_questionnaire==1 & has_y10_questionnaire==1 | has_y0_questionnaire==1 &
> has_y5_questionnaire==1 & has_y10_questionnaire==1 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_ques
> tionnaire==0 | has_y0_questionnaire==1 & cbmi_y10 !=. & has_y10_questionnaire==0 | has_y0_questionnaire==
> 1 & cbmi_y5 !=. & has_y5_questionnaire==0 & cbmi_y10 !=. & has_y10_questionnaire==0 | has_y0_questionnair
> e==1 & cbmi_y5 !=. & has_y5_questionnaire==1 | has_y0_questionnaire==1 & cbmi_y10 !=. & has_y10_questionn
> aire==1 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_questionnaire==1 & cbmi_y10 !=. & has_y10_questi
> onnaire==1, group(id) cluster (current_county_y1) robust nolog
note: multiple positive outcomes within groups encountered.
note: 469 groups (1,106 obs) dropped because of all positive or
      all negative outcomes.

Conditional (fixed-effects) logistic regression

                                                Number of obs     =        547
                                                Wald chi2(8)      =      86.50
                                                Prob > chi2       =     0.0000
Log pseudolikelihood = -194.59969               Pseudo R2         =     0.0172

                             (Std. Err. adjusted for 20 clusters in current_county_y1)
----------------------------------------------------------------------------------------------
                             |               Robust
             binary_health_y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------------+----------------------------------------------------------------
psum_unemployed_total_cont_y |  -.0708697   .0309798    -2.29   0.022    -.1315889   -.0101504
                             |
                        year |
                          5  |  -.3027965   .1272918    -2.38   0.017    -.5522839    -.053309
                         10  |   .4799054   .2673404     1.80   0.073    -.0440723    1.003883
                             |
             maritalstatus_y |
                 Cohabiting  |   .3540121   .2454215     1.44   0.149    -.1270052    .8350295
                  Separated  |  -.5968626   1.514624    -0.39   0.694    -3.565471    2.371746
                   Divorced  |  -1.213241   .6442842    -1.88   0.060    -2.476015    .0495326
                    Widowed  |  -.0610695   1.433941    -0.04   0.966    -2.871542    2.749404
       Single/Never married  |    .080062   .3159233     0.25   0.800    -.5391362    .6992603
----------------------------------------------------------------------------------------------

. margins, dydx(psum_unemployed_total_cont_y) post

Average marginal effects                        Number of obs     =        547
Model VCE    : Robust

Expression   : Pr(binary_health_y|fixed effect is 0), predict(pu0)
dy/dx w.r.t. : psum_unemployed_total_cont_y

----------------------------------------------------------------------------------------------
                             |            Delta-method
                             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------------------+----------------------------------------------------------------
psum_unemployed_total_cont_y |  -.0157168   .0055284    -2.84   0.004    -.0265523   -.0048812
----------------------------------------------------------------------------------------------

. estimates store logitmod

. estimates table logitmod, star stats(N r2 r2_a)

------------------------------
    Variable |   logitmod
-------------+----------------
psum_unemp~y | -.01571676**
-------------+----------------
           N |        547
          r2 |
        r2_a |
------------------------------
legend: * p<0.05; ** p<0.01; *** p<0.001
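The long `if` qualifier above repeats several conditions that are already implied by others: for example, any observation that satisfies the three-way condition on `has_y0_questionnaire`, `has_y5_questionnaire` and `has_y10_questionnaire` also satisfies the first two-way condition. As a sketch only, the same estimation sample could be flagged once and reused. `insample` is a hypothetical variable name, the equivalence relies on Stata evaluating `&` before `|`, and the `///` line continuations assume the code is run from a do-file.

```stata
* Hypothetical indicator for the estimation sample (name is illustrative).
* Four of the nine original alternatives are kept; the other five are
* subsets of these and so add no observations.
generate byte insample = has_y0_questionnaire==1 & (          ///
      has_y5_questionnaire==1                                 ///
    | has_y10_questionnaire==1                                ///
    | (cbmi_y5  != . & has_y5_questionnaire==0)               ///
    | (cbmi_y10 != . & has_y10_questionnaire==0) )

* Same model as above, with the flag in place of the long qualifier
clogit binary_health_y psum_unemployed_total_cont_y i.year i.maritalstatus_y ///
    if insample, group(id) cluster(current_county_y1) robust nolog
```

Comparing `count if insample` with the number of observations Stata reports before dropping groups would be a quick check that the flag reproduces the original sample.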
Code:
. des yrlycurrent_county_y

              storage   display    value
variable name   type    format     label      variable label
-----------------------------------------------------------------------------------------------------------
yrlycurrent_c~y str23   %23s

. sum yrlycurrent_county_y

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
yrlycurren~y |          0
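`summarize` reports zero observations here because `yrlycurrent_county_y` is a string variable (`str23`), not because the values are missing. Below is a minimal sketch of how the county could be turned into a labelled numeric variable, and how the respondents who ever change county could be counted, since only those movers contribute within-group variation to county dummies in `clogit`. It assumes `yrlycurrent_county_y` holds the wave-specific county and that `id` identifies respondents as in the models above; `county_num`, `mover` and `first_row` are hypothetical names.

```stata
* Labelled numeric copy of the string county variable (new name is illustrative)
encode yrlycurrent_county_y, generate(county_num)

* Flag respondents observed in more than one county across waves.
* Within each id the rows are sorted by county code, so the first and last
* codes differ only if the respondent moved. Missing codes sort last and
* would also count as a change.
bysort id (county_num): generate byte mover = county_num[1] != county_num[_N]

* One row per respondent, then tabulate movers versus stayers
egen byte first_row = tag(id)
tabulate mover if first_row
```

If very few respondents are movers, most county dummies will be empty or omitted, which is consistent with the notes in the first model's output.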
Code:
. sum age_y

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
       age_y |      3,123    35.06398    7.199812       15.1       53.7

. des age_y

              storage   display    value
variable name   type    format     label      variable label
-----------------------------------------------------------------------------------------------------------
age_y           float   %9.0g
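Because `age_y` is a continuous float, it rises by roughly, but not exactly, the number of years between waves for every respondent. In a fixed-effects model that already contains `i.year`, that leaves very little independent within-person variation in age, which is one plausible reason for the convergence trouble. The sketch below is one way to check this. It assumes `year` holds the wave in years (0, 5, 10) and that `id` and `year` jointly identify observations; `age_dev` is a hypothetical name.

```stata
* Declare the panel structure (assumes one row per id and year)
xtset id year

* Overall, between and within standard deviations of age
xtsum age_y

* Deviation of each respondent's ageing from the elapsed time since their
* first observed wave; values near zero mean age is almost collinear with year
bysort id (year): generate double age_dev = (age_y - age_y[1]) - (year - year[1])
summarize age_dev, detail
```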
I would like to keep these variables in my analysis, so what can I do?
Code:
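* Whole-year versions of age, and a float copy of the county variable
gen int rage_y = round(age_y)
gen float fyrlycurrent_county_y1 = yrlycurrent_county_y1
replace age_y = round(age_y, 1)
* Ten-year age bands, non-overlapping and covering the observed range 15 to 55
gen age_bands_y = .
replace age_bands_y = 1 if age_y >= 15 & age_y < 25
replace age_bands_y = 2 if age_y >= 25 & age_y < 35
replace age_bands_y = 3 if age_y >= 35 & age_y < 45
replace age_bands_y = 4 if age_y >= 45 & age_y <= 55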
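As a sketch of how the banded age could then be tried in place of continuous age, the model can be refit with `i.age_bands_y`. This is only one possible specification, not a recommendation. The sample restriction used in the models above is left out here for brevity, and `difficult` is a standard maximization option that may or may not help with convergence.

```stata
* Check how many observations fall in each band before using it
tabulate age_bands_y, missing

* Conditional logit with banded age instead of continuous age
* (add the same if qualifier as in the earlier models to match the sample)
clogit binary_health_y psum_unemployed_total_cont_y i.year i.age_bands_y ///
    i.maritalstatus_y, group(id) cluster(current_county_y1) robust       ///
    iterate(500) difficult nolog
```

Only respondents who cross a band boundary between waves help identify the band coefficients, so sparse bands may still be omitted.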
Thanks for any and all advice,
John