I am running a fixed-effects (conditional) logistic regression of self-rated health on local employment change over three waves, as follows (standard errors are clustered on baseline geographical location):
Code:
. clogit binary_health_y psum_unemployed_total_cont_y i.yrlycurrent_county_y1 i.year age_y i.maritalstatus_
> y if has_y0_questionnaire==1 & has_y5_questionnaire==1 | has_y0_questionnaire==1 & has_y10_questionnaire=
> =1 | has_y0_questionnaire==1 & has_y5_questionnaire==1 & has_y10_questionnaire==1 | has_y0_questionnaire=
> =1 & cbmi_y5 !=. & has_y5_questionnaire==0 | has_y0_questionnaire==1 & cbmi_y10 !=. & has_y10_questionnai
> re==0 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_questionnaire==0 & cbmi_y10 !=. & has_y10_question
> naire==0 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_questionnaire==1 | has_y0_questionnaire==1 & cb
> mi_y10 !=. & has_y10_questionnaire==1 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_questionnaire==1 &
> cbmi_y10 !=. & has_y10_questionnaire==1, group(id) cluster (current_county_y1) robust iterate(500) nolog
note: multiple positive outcomes within groups encountered.
note: 447 groups (1,057 obs) dropped because of all positive or
all negative outcomes.
note: 3.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 6.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 7.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 13.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 15.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 16.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 17.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 18.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 19.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 20.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 23.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 24.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 25.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 26.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 28.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 29.yrlycurrent_county_y1 omitted because of no within-group variance.
note: 32.yrlycurrent_county_y1 omitted because of no within-group variance.
convergence not achieved
Conditional (fixed-effects) logistic regression
Number of obs = 524
Wald chi2(16) = .
Prob > chi2 = .
Log pseudolikelihood = -178.37962 Pseudo R2 = 0.0592
(Std. Err. adjusted for 20 clusters in current_county_y1)
----------------------------------------------------------------------------------------------
| Robust
binary_health_y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-----------------------------+----------------------------------------------------------------
psum_unemployed_total_cont_y | -.0720918 .0388134 -1.86 0.063 -.1481648 .0039811
|
yrlycurrent_county_y1 |
Cavan | 0 (empty)
Clare | 3.508484 1.437552 2.44 0.015 .6909335 6.326035
Cork | -701.8535 . . . . .
Donegal | 0 (empty)
Dublin 16 | 0 (empty)
Dublin City | 5.209175 1.927489 2.70 0.007 1.431367 8.986984
Dún Laoghaire-Rathdown | 5.798566 1.803271 3.22 0.001 2.264218 9.332913
Fingal | 5.31021 2.008511 2.64 0.008 1.373601 9.246819
Galway | .7730753 1.945584 0.40 0.691 -3.040199 4.58635
Galway City | .7289299 . . . . .
Kerry | 0 (empty)
Kildare | .5227628 1.379929 0.38 0.705 -2.181848 3.227374
Kilkenny | 0 (omitted)
Laois | 0 (omitted)
Leitrim | 0 (empty)
Limerick | 0 (empty)
Longford | 0 (empty)
Louth | 0 (empty)
Mayo | -37.39307 . . . . .
Meath | 5.840481 1.762588 3.31 0.001 2.385872 9.295089
Monaghan | 0 (empty)
Offaly | 0 (omitted)
Roscommon | 0 (omitted)
Sligo | 0 (empty)
South Dublin | 5.426697 1.908334 2.84 0.004 1.686432 9.166962
Tipperary | 0 (empty)
Tipperary North | 0 (empty)
Waterford | -559.1875 . . . . .
Westmeath | 3.329881 1.503996 2.21 0.027 .382103 6.277658
Wexford | 0 (omitted)
Wicklow | -97.28333 . . . . .
|
year |
5 | -.4421024 .2411563 -1.83 0.067 -.9147601 .0305553
10 | .2906156 . . . . .
|
age_y | .0180895 .030657 0.59 0.555 -.0419972 .0781762
|
maritalstatus_y |
Cohabiting | .2093029 .1423515 1.47 0.141 -.069701 .4883068
Separated | -.5001542 1.496891 -0.33 0.738 -3.434007 2.433699
Divorced | -1.252438 .6516711 -1.92 0.055 -2.529689 .0248143
Widowed | .5818359 1.804183 0.32 0.747 -2.954298 4.117969
Single/Never married | -.025371 .4241849 -0.06 0.952 -.8567581 .8060161
----------------------------------------------------------------------------------------------
Warning: convergence not achieved
. margins, dydx(psum_unemployed_total_cont_y) post
Average marginal effects Number of obs = 524
Model VCE : Robust
Expression : Pr(binary_health_y|fixed effect is 0), predict(pu0)
dy/dx w.r.t. : psum_unemployed_total_cont_y
----------------------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. z P>|z| [95% Conf. Interval]
-----------------------------+----------------------------------------------------------------
psum_unemployed_total_cont_y | -.0083921 .0045424 -1.85 0.065 -.0172951 .0005108
----------------------------------------------------------------------------------------------
However, convergence is not achieved. After reviewing the Statalist archives, I removed variables one by one until I identified the two culprits: respondent's age and geographical location. Neither is included in the model below:
Code:
. clogit binary_health_y psum_unemployed_total_cont_y i.year i.maritalstatus_y if has_y0_questionnaire==1 &
> has_y5_questionnaire==1 | has_y0_questionnaire==1 & has_y10_questionnaire==1 | has_y0_questionnaire==1 &
> has_y5_questionnaire==1 & has_y10_questionnaire==1 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_ques
> tionnaire==0 | has_y0_questionnaire==1 & cbmi_y10 !=. & has_y10_questionnaire==0 | has_y0_questionnaire==
> 1 & cbmi_y5 !=. & has_y5_questionnaire==0 & cbmi_y10 !=. & has_y10_questionnaire==0 | has_y0_questionnair
> e==1 & cbmi_y5 !=. & has_y5_questionnaire==1 | has_y0_questionnaire==1 & cbmi_y10 !=. & has_y10_questionn
> aire==1 | has_y0_questionnaire==1 & cbmi_y5 !=. & has_y5_questionnaire==1 & cbmi_y10 !=. & has_y10_questi
> onnaire==1, group(id) cluster (current_county_y1) robust nolog
note: multiple positive outcomes within groups encountered.
note: 469 groups (1,106 obs) dropped because of all positive or
all negative outcomes.
Conditional (fixed-effects) logistic regression
Number of obs = 547
Wald chi2(8) = 86.50
Prob > chi2 = 0.0000
Log pseudolikelihood = -194.59969 Pseudo R2 = 0.0172
(Std. Err. adjusted for 20 clusters in current_county_y1)
----------------------------------------------------------------------------------------------
| Robust
binary_health_y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-----------------------------+----------------------------------------------------------------
psum_unemployed_total_cont_y | -.0708697 .0309798 -2.29 0.022 -.1315889 -.0101504
|
year |
5 | -.3027965 .1272918 -2.38 0.017 -.5522839 -.053309
10 | .4799054 .2673404 1.80 0.073 -.0440723 1.003883
|
maritalstatus_y |
Cohabiting | .3540121 .2454215 1.44 0.149 -.1270052 .8350295
Separated | -.5968626 1.514624 -0.39 0.694 -3.565471 2.371746
Divorced | -1.213241 .6442842 -1.88 0.060 -2.476015 .0495326
Widowed | -.0610695 1.433941 -0.04 0.966 -2.871542 2.749404
Single/Never married | .080062 .3159233 0.25 0.800 -.5391362 .6992603
----------------------------------------------------------------------------------------------
. margins, dydx(psum_unemployed_total_cont_y) post
Average marginal effects Number of obs = 547
Model VCE : Robust
Expression : Pr(binary_health_y|fixed effect is 0), predict(pu0)
dy/dx w.r.t. : psum_unemployed_total_cont_y
----------------------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. z P>|z| [95% Conf. Interval]
-----------------------------+----------------------------------------------------------------
psum_unemployed_total_cont_y | -.0157168 .0055284 -2.84 0.004 -.0265523 -.0048812
----------------------------------------------------------------------------------------------
. estimates store logitmod
. estimates table logitmod, star stats(N r2 r2_a)
------------------------------
Variable | logitmod
-------------+----------------
psum_unemp~y | -.01571676**
-------------+----------------
N | 547
r2 |
r2_a |
------------------------------
legend: * p<0.05; ** p<0.01; *** p<0.001
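The long if qualifier in both clogit calls may be easier to audit if it is collapsed into a single estimation-sample flag. Stata evaluates & before |, so the qualifier is an OR of AND-terms that all require has_y0_questionnaire==1, and several terms only add further conditions to another term. A minimal sketch, assuming it is run from a do-file (for the /// continuations); insample is a hypothetical variable name:

```stata
* Hypothetical flag; insample is not a variable in the original data.
* Intended to select the same observations as the original if qualifier,
* with the redundant OR-terms removed.
generate byte insample = has_y0_questionnaire==1 & (        ///
      has_y5_questionnaire==1                               ///
    | has_y10_questionnaire==1                              ///
    | (cbmi_y5  != . & has_y5_questionnaire==0)             ///
    | (cbmi_y10 != . & has_y10_questionnaire==0) )

* Check the flag before relying on it
tabulate insample, missing

* Same specification as the second model above, using the flag
clogit binary_health_y psum_unemployed_total_cont_y i.year i.maritalstatus_y ///
    if insample, group(id) vce(cluster current_county_y1) nolog
```

Comparing the Number of obs and the dropped-group note against the output above is a quick way to confirm the flag picks out the same sample.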
Code:
. des yrlycurrent_county_y
storage display value
variable name type format label variable label
-----------------------------------------------------------------------------------------------------------
yrlycurrent_c~y str23 %23s
. sum yrlycurrent_county_y
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
yrlycurren~y | 0
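summarize reports 0 observations here because yrlycurrent_county_y is stored as a string (str23), and summarize only computes statistics for numeric variables. The "no within-group variance" and "(empty)" notes in the first model also suggest that few respondents change county between waves, which can be checked directly. A minimal sketch, assuming id and year identify person-waves as in the clogit calls; county_num, moved, ever_moved and one_per_id are hypothetical names:

```stata
* encode maps each distinct string to an integer code with value labels,
* which is the form that factor-variable notation (i.) expects
encode yrlycurrent_county_y, generate(county_num)
tabulate county_num, missing

* Flag person-waves whose county differs from the first recorded county
* (a missing county also counts as "different" under this comparison)
bysort id (year): generate byte moved = county_num != county_num[1]
bysort id: egen byte ever_moved = max(moved)

* Count movers once per respondent
egen byte one_per_id = tag(id)
tabulate ever_moved if one_per_id
```

If only a handful of respondents ever move, most county dummies are identified from very few observations, which would be consistent with the very large coefficients and missing standard errors shown for Cork, Mayo, Waterford and Wicklow.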
Code:
. sum age_y
Variable | Obs Mean Std. Dev. Min Max
-------------+---------------------------------------------------------
age_y | 3,123 35.06398 7.199812 15.1 53.7
. des age_y
storage display value
variable name type format label variable label
-----------------------------------------------------------------------------------------------------------
age_y float %9.0g
.
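One thing that may be worth checking for age_y: once person fixed effects are conditioned out, age can only contribute through its change within a person, and that change may be almost the same for everyone in a given wave. A minimal diagnostic sketch, assuming the first observation per id when sorted by year is the baseline wave; age_change is a hypothetical name:

```stata
* Change in age since each respondent's first observed wave
bysort id (year): generate double age_change = age_y - age_y[1]

* If the sd within each wave is close to zero, age_y carries almost no
* information beyond i.year in a fixed-effects model
tabstat age_change, by(year) statistics(n mean sd min max)
```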
I would like to keep these variables in my analysis. What can I do?
Code:
gen int rage_y = round(age_y)
gen float fyrlycurrent_county_y1 = yrlycurrent_county_y1
replace age_y=round(age_y, 1)
gen age_bands_y=.
replace age_bands_y = 1 if age_y>=15 & age_y<= 25
replace age_bands_y = 2 if age_y>=25 & age_y<= 35
replace age_bands_y = 3 if age_y>=45 & age_y<= 55
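Note that the banding commands above overlap at age 25 and leave ages between 35 and 45 unassigned. A minimal sketch of non-overlapping bands using egen's cut() function, assuming ten-year bands are what is wanted; age_band and age_band_lbl are hypothetical names:

```stata
* Left-closed bands [15,25) [25,35) [35,45) [45,55), coded 0 to 3 by icodes;
* ages outside 15 to <55 are set to missing
egen age_band = cut(age_y), at(15 25 35 45 55) icodes

label define age_band_lbl 0 "15 to <25" 1 "25 to <35" 2 "35 to <45" 3 "45 to <55"
label values age_band age_band_lbl

* Confirm every non-missing age falls in exactly one band
tabulate age_band, missing
```

This uses the unrounded age_y; whether banded age varies enough within respondents to be estimable in clogit would still need checking.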
Thanks for any and all advice,
John
