Dear Stata experts,
I have questions about running a panel logistic regression with interaction terms and reporting the results. I am using Stata 17.0.
My dataset has two years of observations (2019 and 2020), recorded at the individual level (cpsidp) and nested within families (cpsid). I ran xtset cpsidp year. Here is a sample of my data.
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float(fam_pov dis_inc_ratio pandemic hh_nhisblack hh_children hh_age hh_bachelor hh_emp famemp_two) int year
0 0 0 1 0 68 0 0 3 2019
0 0 1 1 0 69 0 0 6 2020
0 0 0 1 0 68 0 0 3 2019
0 0 1 1 0 69 0 0 6 2020
0 1 0 1 0 57 0 0 . 2019
0 .99965 1 1 0 58 0 0 . 2020
0 0 0 0 0 58 0 0 3 2019
0 0 1 0 0 59 0 0 3 2020
0 0 0 0 0 58 0 0 3 2019
0 0 1 0 0 59 0 0 3 2020
end
label values hh_children hh_children
label def hh_children 0 "No children", modify
label values hh_bachelor education
label def education 0 "Less than bachelor's", modify
label values hh_emp employment
label def employment 0 "Not working", modify
label values famemp_two famemptwo
label def famemptwo 3 "One full time, one nonworker", modify
label def famemptwo 6 "Two nonworking", modify
I want to see how the share of disability income in family income (dis_inc_ratio) affects a family's probability of being in poverty (fam_pov: 1 if in poverty, 0 otherwise) before and during the pandemic (pandemic), and whether the effect differs in certain subgroups (households with a Non Hispanic Black householder or a female householder). I also want to see whether family structure matters, so I split my sample into two groups: two-headed and single-headed households. To estimate this, I am running fixed-effects logistic models with the following commands:
1. Two-headed full sample
Code:
xtlogit fam_pov c.dis_inc_ratio##i.pandemic hh_children hh_age hh_bachelor hh_emp i.famemp_two if twoheaded==1 , fe vce(oim)
2. Two-headed households with Non Hispanic Black householders
Code:
xtlogit fam_pov c.dis_inc_ratio##i.pandemic##i.hh_nhisblack hh_children hh_age hh_bachelor hh_emp i.famemp_two if twoheaded==1 , fe vce(oim)
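3. Two-headed households with Female householders
Code:
xtlogit fam_pov c.dis_inc_ratio##i.pandemic##i.hh_female hh_children hh_age hh_bachelor hh_emp i.famemp_two if twoheaded==1, fe vce(oim)
I'm running the same commands separately for single-headed households.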
My questions are the following:
1. How should I report the odds ratios for the triple interaction terms? I understand from other posts here and other sources that -margins- after -xtlogit, fe- can be problematic because margins by default gives you "the probability of a positive outcome assuming that fixed effect is zero (https://www3.nd.edu/~rwilliam/Taiwan2018/FixedEffects.pdf)." In the following result, for example, how could I report the effect of having a Non Hispanic Black householder on the probability of being in poverty? Should I just add up all the odds ratios of the interaction terms (3.18 + 0.87 + 0.62 + 254955.8 + 1.29 + 1.91 + 0.16)?
Code:
qui xtlogit fam_pov c.dis_inc_ratio##i.pandemic##i.hh_nhisblack hh_children hh_age hh_bachelor hh_emp i.famemp_two if twoheaded==1, fe vce(oim)
xtlogit, or

Conditional fixed-effects logistic regression        Number of obs    =  3,424
Group variable: cpsidp                               Number of groups =  1,712

                                                     Obs per group:
                                                                  min =      2
                                                                  avg =    2.0
                                                                  max =      2

                                                     LR chi2(16)      = 101.50
Log likelihood = -1135.9168                          Prob > chi2      = 0.0000

-------------------------------------------------------------------------------------------------------
                              fam_pov | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
--------------------------------------+----------------------------------------------------------------
                        dis_inc_ratio |   3.175262   .9580835     3.83   0.000     1.757695    5.736088
                           1.pandemic |   .8659488   .0532525    -2.34   0.019     .7676206    .9768723
                                      |
             pandemic#c.dis_inc_ratio |
                                    1 |   .6193452   .1914241    -1.55   0.121     .3379463    1.135058
                                      |
                       1.hh_nhisblack |   254955.8   2.01e+08     0.02   0.987            0           .
                                      |
         hh_nhisblack#c.dis_inc_ratio |
                                    1 |    1.28513   .9157518     0.35   0.725     .3179818    5.193883
                                      |
                pandemic#hh_nhisblack |
                                  1 1 |   1.912265   .3619197     3.43   0.001     1.319616    2.771076
                                      |
pandemic#hh_nhisblack#c.dis_inc_ratio |
                                  1 1 |   .1640642   .1498309    -1.98   0.048     .0273942     .982584
                                      |
                          hh_children |    2.54656   2.033707     1.17   0.242     .5323246    12.18236
                               hh_age |   .9850302    .030007    -0.50   0.621     .9279388    1.045634
                          hh_bachelor |   1.166594   .3206913     0.56   0.575     .6806571    1.999453
                               hh_emp |   2.335641   .4686148     4.23   0.000     1.576246    3.460893
                                      |
                           famemp_two |
         One full time, one part time |   2.359379   .4297561     4.71   0.000     1.651022    3.371649
         One full time, one nonworker |   1.807077   .3401693     3.14   0.002     1.249523     2.61342
                Two part time workers |   2.480789   .7026637     3.21   0.001     1.423947    4.322011
        One part-time, one non-worker |   2.652368   .6501041     3.98   0.000     1.640596    4.288112
                       Two nonworking |   4.180847   1.143527     5.23   0.000     2.445949      7.1463
-------------------------------------------------------------------------------------------------------
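A note on the arithmetic in question 1, offered as a sketch rather than a definitive answer. Odds ratios are exponentiated coefficients, so terms combine by adding coefficients on the log-odds scale, which is the same as multiplying the odds ratios; summing the odds ratios themselves does not give an interpretable quantity. The sketch below assumes the triple-interaction model above is the active estimation result and that the stored coefficient names match the ones written here, which the coeflegend replay lets you check. The /// line continuations assume the code is run from a do-file.

```stata
* Sketch only: refit the triple-interaction model from this post so that
* it is the active estimation result.
quietly xtlogit fam_pov c.dis_inc_ratio##i.pandemic##i.hh_nhisblack ///
    hh_children hh_age hh_bachelor hh_emp i.famemp_two ///
    if twoheaded==1, fe vce(oim)

* Replay with the coefficient legend to confirm the names used below.
xtlogit, coeflegend

* Odds ratio for a one-unit change in dis_inc_ratio,
* Non Hispanic Black householder, pre-pandemic (pandemic == 0).
lincom dis_inc_ratio + 1.hh_nhisblack#c.dis_inc_ratio, or

* Same group during the pandemic (pandemic == 1): all four
* dis_inc_ratio terms are summed on the log-odds scale.
lincom dis_inc_ratio + 1.pandemic#c.dis_inc_ratio ///
    + 1.hh_nhisblack#c.dis_inc_ratio ///
    + 1.pandemic#1.hh_nhisblack#c.dis_inc_ratio, or
```

Each -lincom- result is the odds ratio associated with dis_inc_ratio for that group and period, with a confidence interval. It is not a change in the probability of poverty, so the caveat about -margins- after -xtlogit, fe- still applies.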
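2. I use triple interaction terms to estimate heterogeneous effects across the subgroups, but I wonder whether I should instead run the regressions on subsamples of each subgroup with a two-way interaction term. For example, command #2 above would become
Code:
xtlogit fam_pov c.dis_inc_ratio##i.pandemic hh_children hh_age hh_bachelor hh_emp i.famemp_two if twoheaded==1 & hh_nhisblack==1, fe vce(oim)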
This is just to check whether I am running appropriate models for my study. However, this option leaves a small sample, as in the following output:
Code:
qui xtlogit fam_pov c.dis_inc_ratio##i.pandemic hh_children hh_age hh_bachelor hh_emp i.famemp_two if twoheaded==1 & hh_nhisblack==1, fe vce(oim)
xtlogit, or

Conditional fixed-effects logistic regression        Number of obs    =    342
Group variable: cpsidp                               Number of groups =    171

                                                     Obs per group:
                                                                  min =      2
                                                                  avg =    2.0
                                                                  max =      2

                                                     LR chi2(10)      =  59.07
Log likelihood = -88.99181                           Prob > chi2      = 0.0000

------------------------------------------------------------------------------------------------
                       fam_pov | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
-------------------------------+----------------------------------------------------------------
                 dis_inc_ratio |   8.140545     5.8601     2.91   0.004      1.98565     33.3737
                    1.pandemic |   4.777616   2.194409     3.40   0.001     1.941983    11.75377
                               |
      pandemic#c.dis_inc_ratio |
                             1 |   .0676488   .0621612    -2.93   0.003     .0111714    .4096497
                               |
                   hh_children |          1  (omitted)
                        hh_age |   .4758893   .1906921    -1.85   0.064      .216982     1.04373
                   hh_bachelor |   4.12e-08   .0000336    -0.02   0.983            0           .
                        hh_emp |   1.870268   1.207954     0.97   0.332     .5273956    6.632408
                               |
                    famemp_two |
  One full time, one part time |   1.401092   .8178162     0.58   0.563     .4462943    4.398577
  One full time, one nonworker |   .5282689   .2632813    -1.28   0.200     .1988974    1.403076
         Two part time workers |          1  (empty)
 One part-time, one non-worker |   1.35e-08   .0000296    -0.01   0.993            0           .
                Two nonworking |   .2626393   .2036896    -1.72   0.085     .0574396    1.200903
------------------------------------------------------------------------------------------------
Which approach would be more appropriate for my study: regressions with triple interaction terms, or separate regressions on the small subsamples for each subgroup?
I really appreciate any comments in advance. Thank you so much!