  • Panel logistic regression with triple interaction terms

    Dear Stata experts,

    I have questions about running a panel logistic regression with interaction terms and reporting the results. I am using Stata 17.0.

    My dataset has two years of observations (2019 and 2020), recorded at the individual level (cpsidp) and nested within families (cpsid). I declared the panel with xtset cpsidp year. Here's a sample of my data.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(fam_pov dis_inc_ratio pandemic hh_nhisblack hh_children hh_age hh_bachelor hh_emp famemp_two) int year
    0      0 0 1 0 68 0 0 3 2019
    0      0 1 1 0 69 0 0 6 2020
    0      0 0 1 0 68 0 0 3 2019
    0      0 1 1 0 69 0 0 6 2020
    0      1 0 1 0 57 0 0 . 2019
    0 .99965 1 1 0 58 0 0 . 2020
    0      0 0 0 0 58 0 0 3 2019
    0      0 1 0 0 59 0 0 3 2020
    0      0 0 0 0 58 0 0 3 2019
    0      0 1 0 0 59 0 0 3 2020
    end
    label values hh_children hh_children
    label def hh_children 0 "No children", modify
    label values hh_bachelor education
    label def education 0 "Less than bachelor's", modify
    label values hh_emp employment
    label def employment 0 "Not working", modify
    label values famemp_two famemptwo
    label def famemptwo 3 "One full time, one nonworker", modify
    label def famemptwo 6 "Two nonworking", modify


    I am interested in how the share of disability income in family income (dis_inc_ratio) affects a family's probability of being in poverty (fam_pov: 1 if in poverty, 0 otherwise) before and during the pandemic (pandemic), and in whether the effect differs for certain subgroups (households with a non-Hispanic Black householder or a female householder). I also want to see whether family structure matters, so I split my sample into two groups: two-headed and single-headed households. To estimate this, I am running fixed-effects logistic models with the following commands:

    1. Two-headed full sample
    Code:
    xtlogit fam_pov c.dis_inc_ratio##i.pandemic hh_children hh_age hh_bachelor hh_emp i.famemp_two if twoheaded==1 , fe vce(oim)
    2. Two-headed households with non-Hispanic Black householders
    Code:
    xtlogit fam_pov c.dis_inc_ratio##i.pandemic##i.hh_nhisblack hh_children hh_age hh_bachelor hh_emp i.famemp_two if twoheaded==1 , fe vce(oim)
    3. Two-headed households with female householders
    Code:
    xtlogit fam_pov c.dis_inc_ratio##i.pandemic##i.hh_female hh_children hh_age hh_bachelor hh_emp i.famemp_two if twoheaded==1, fe vce(oim)
    I run the same commands separately for single-headed households.



    My questions are the following:

    1. How should I report the odds ratios for the triple interaction terms? I understand from other posts here and from other sources that -margins- after -xtlogit, fe- can be problematic because margins by default gives "the probability of a positive outcome assuming that fixed effect is zero" (https://www3.nd.edu/~rwilliam/Taiwan2018/FixedEffects.pdf). In the following result, for example, how could I report the effect of having a non-Hispanic Black householder on the probability of being in poverty? Should I just add up the odds ratios of all the terms involved in the triple interaction (3.18 + 0.87 + 0.62 + 254955.8 + 1.29 + 1.91 + 0.16)?

    Code:
    qui xtlogit fam_pov c.dis_inc_ratio##i.pandemic##i.hh_nhisblack hh_children hh_age hh_bachelor hh_emp i.famemp_two if twoheaded==1 , fe vce(oim)
    xtlogit, or
    
    Conditional fixed-effects logistic regression        Number of obs    =  3,424
    Group variable: cpsidp                               Number of groups =  1,712
    
                                                         Obs per group:
                                                                      min =      2
                                                                      avg =    2.0
                                                                      max =      2
    
                                                         LR chi2(16)      = 101.50
    Log likelihood = -1135.9168                          Prob > chi2      = 0.0000
    
    -------------------------------------------------------------------------------------------------------
                                  fam_pov | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
    --------------------------------------+----------------------------------------------------------------
                            dis_inc_ratio |   3.175262   .9580835     3.83   0.000     1.757695    5.736088
                               1.pandemic |   .8659488   .0532525    -2.34   0.019     .7676206    .9768723
                                          |
                 pandemic#c.dis_inc_ratio |
                                       1  |   .6193452   .1914241    -1.55   0.121     .3379463    1.135058
                                          |
                           1.hh_nhisblack |   254955.8   2.01e+08     0.02   0.987            0           .
                                          |
             hh_nhisblack#c.dis_inc_ratio |
                                       1  |    1.28513   .9157518     0.35   0.725     .3179818    5.193883
                                          |
                    pandemic#hh_nhisblack |
                                     1 1  |   1.912265   .3619197     3.43   0.001     1.319616    2.771076
                                          |
    pandemic#hh_nhisblack#c.dis_inc_ratio |
                                     1 1  |   .1640642   .1498309    -1.98   0.048     .0273942     .982584
                                          |
                              hh_children |    2.54656   2.033707     1.17   0.242     .5323246    12.18236
                                   hh_age |   .9850302    .030007    -0.50   0.621     .9279388    1.045634
                              hh_bachelor |   1.166594   .3206913     0.56   0.575     .6806571    1.999453
                                   hh_emp |   2.335641   .4686148     4.23   0.000     1.576246    3.460893
                                          |
                               famemp_two |
            One full time, one part time  |   2.359379   .4297561     4.71   0.000     1.651022    3.371649
            One full time, one nonworker  |   1.807077   .3401693     3.14   0.002     1.249523     2.61342
                   Two part time workers  |   2.480789   .7026637     3.21   0.001     1.423947    4.322011
           One part-time, one non-worker  |   2.652368   .6501041     3.98   0.000     1.640596    4.288112
                          Two nonworking  |   4.180847   1.143527     5.23   0.000     2.445949      7.1463
    -------------------------------------------------------------------------------------------------------
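
    On the question of adding the odds ratios: odds ratios combine by multiplication, not addition, because it is the coefficients that add on the log-odds scale. With the estimates in the table above, for example, the odds ratio for dis_inc_ratio among households with a non-Hispanic Black householder during the pandemic is the product of four terms (3.175262 * .6193452 * 1.28513 * .1640642, about 0.41). A minimal sketch with -lincom-, assuming the triple-interaction model above is still the active estimation result; -lincom- also reports a standard error and confidence interval for each combination:

    Code:
    * Odds ratio for dis_inc_ratio: non-Hispanic Black householder, during the pandemic
    lincom dis_inc_ratio + 1.pandemic#c.dis_inc_ratio + 1.hh_nhisblack#c.dis_inc_ratio + 1.pandemic#1.hh_nhisblack#c.dis_inc_ratio, or
    * Same group, before the pandemic
    lincom dis_inc_ratio + 1.hh_nhisblack#c.dis_inc_ratio, or
    * Other householders, during the pandemic
    lincom dis_inc_ratio + 1.pandemic#c.dis_inc_ratio, or
    * Hand check of the first product, using the odds ratios in the table
    display 3.175262*.6193452*1.28513*.1640642

    The main-effect odds ratio for 1.hh_nhisblack (254955.8 with a standard error of 2.01e+08) is probably not interpretable. The fixed-effects estimator identifies that term only from individuals whose value of hh_nhisblack changes between 2019 and 2020, and an estimate like this usually suggests there are very few of them.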
    
    
    2. I use triple interaction terms to estimate heterogeneous effects across the subgroups, but I wonder whether I should instead run the regressions on subsamples of each subgroup with a two-way interaction term. For example, command #2 above would become
    Code:
    xtlogit fam_pov c.dis_inc_ratio##i.pandemic hh_children hh_age hh_bachelor hh_emp i.famemp_two if twoheaded==1 & hh_nhisblack==1 , fe vce(oim)
    This is just to check that I am specifying proper models for my study. However, this option leaves a small sample, as the following results show:


    Code:
    qui xtlogit fam_pov c.dis_inc_ratio##i.pandemic hh_children hh_age hh_bachelor hh_emp i.famemp_two if twoheaded==1 & hh_nhisblack==1 , fe vce(oim)
    xtlogit, or
    
    Conditional fixed-effects logistic regression        Number of obs    =    342
    Group variable: cpsidp                               Number of groups =    171
    
                                                         Obs per group:
                                                                      min =      2
                                                                      avg =    2.0
                                                                      max =      2
    
                                                         LR chi2(10)      =  59.07
    Log likelihood = -88.99181                           Prob > chi2      = 0.0000
    
    ------------------------------------------------------------------------------------------------
                           fam_pov | Odds ratio   Std. err.      z    P>|z|     [95% conf. interval]
    -------------------------------+----------------------------------------------------------------
                     dis_inc_ratio |   8.140545     5.8601     2.91   0.004      1.98565     33.3737
                        1.pandemic |   4.777616   2.194409     3.40   0.001     1.941983    11.75377
                                   |
          pandemic#c.dis_inc_ratio |
                                1  |   .0676488   .0621612    -2.93   0.003     .0111714    .4096497
                                   |
                       hh_children |          1  (omitted)
                            hh_age |   .4758893   .1906921    -1.85   0.064      .216982     1.04373
                       hh_bachelor |   4.12e-08   .0000336    -0.02   0.983            0           .
                            hh_emp |   1.870268   1.207954     0.97   0.332     .5273956    6.632408
                                   |
                        famemp_two |
     One full time, one part time  |   1.401092   .8178162     0.58   0.563     .4462943    4.398577
     One full time, one nonworker  |   .5282689   .2632813    -1.28   0.200     .1988974    1.403076
            Two part time workers  |          1  (empty)
    One part-time, one non-worker  |   1.35e-08   .0000296    -0.01   0.993            0           .
                   Two nonworking  |   .2626393   .2036896    -1.72   0.085     .0574396    1.200903
    ------------------------------------------------------------------------------------------------

    Which approach would be more appropriate for my study: one regression with triple interaction terms, or separate regressions on the smaller subgroup samples?
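
    One way to frame the choice: the subsample regression lets every coefficient differ between groups, while the triple-interaction model holds the coefficients on the control variables equal across groups. A minimal sketch of a pooled model that also interacts hh_nhisblack with each control is given here. This specification is not from the original post. It assumes that hh_children, hh_bachelor and hh_emp are 0/1 indicators and that hh_nhisblack rarely changes within an individual, in which case the estimates should be close to those from separate subsample regressions.

    Code:
    * Pooled model: every coefficient may differ by householder race (Stata omits redundant terms automatically)
    xtlogit fam_pov c.dis_inc_ratio##i.pandemic##i.hh_nhisblack i.hh_children c.hh_age i.hh_bachelor i.hh_emp i.famemp_two i.hh_nhisblack#i.hh_children i.hh_nhisblack#c.hh_age i.hh_nhisblack#i.hh_bachelor i.hh_nhisblack#i.hh_emp i.hh_nhisblack#i.famemp_two if twoheaded==1, fe vce(oim)
    * Joint Wald test that the control coefficients differ between the two groups
    testparm i.hh_nhisblack#i.hh_children i.hh_nhisblack#c.hh_age i.hh_nhisblack#i.hh_bachelor i.hh_nhisblack#i.hh_emp i.hh_nhisblack#i.famemp_two
    * Look for sparse cells in the subsample
    tabulate famemp_two fam_pov if twoheaded==1 & hh_nhisblack==1

    If the joint test gives little evidence that the controls differ by group, the more parsimonious triple-interaction model may be adequate. The cross-tabulation is a quick check on why the subsample output shows an (empty) category and odds ratios near zero with enormous standard errors.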


    I really appreciate any comments in advance. Thank you so much!

  • #2
    While you wait for better answers, one remark: you have not used the margins command to report marginal effects. These are the usual quantities of interest for logit, probit, and other nonlinear models, where they differ from the coefficients.

    Also, you may want to read this paper: https://www.sciencedirect.com/scienc...65176503000326.
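
    For reference, a minimal sketch of what a -margins- call on the log-odds scale could look like after the triple-interaction model, assuming that model is the active estimation result. predict(xb) requests the linear prediction, so the result does not depend on the value of the fixed effect. The numbers are slopes in log odds, not changes in probability, and exponentiating them gives the corresponding odds ratios.

    Code:
    * Slope of dis_inc_ratio on the log-odds scale in each pandemic-by-race cell
    margins pandemic#hh_nhisblack, dydx(dis_inc_ratio) predict(xb)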



    • #3
      Maxence Morlet, thank you for your comment and the paper. I didn't use the margins command because margins is not recommended after xtlogit, fe, so I was wondering which estimates I should report for my models. Hopefully my post will get more visibility. I really appreciate your response. The paper and its references are helpful!
