  • Multiple interaction terms with binary variables

    I have a dataset that contains a sample from a US household survey covering a single time period, and I'm trying to explain the factors that affect wages.

    In my regression I include the variables ch02 and ch35, which represent the number of children in each age range (0 to 2 and 3 to 5) that an individual has.

    I then add an interaction term to account for age in the following manner: ch02#c.age

    Now I want to know how the effects of ch02 and ch35 on wages differ between men and women, for which I have the binary variable female (0 = male, 1 = female).

    I have tried running the regression with the following interaction term: ch02#c.age#female, but Stata reports collinearity, as shown below, and I don't know how to interpret the coefficients. In addition, when I later run the vif command to check for multicollinearity, I obtain a really high value for the variable female.

    Am I making a mistake with the interaction term? Is there another method?

    Code:
    . reg wage age female i.wbhaom citizen married ch02#c.age#female ch35#c.age#female unmem multjob rural i(8/16).educ92 ind_m03 occ_m03 uhou
    > rse, robust
    note: 1.ch35#1.female#c.age omitted because of collinearity.
    
    Linear regression                               Number of obs     =     51,188
                                                    F(28, 51159)      =     687.72
                                                    Prob > F          =     0.0000
                                                    R-squared         =     0.2266
                                                    Root MSE          =     17.489
    
    -----------------------------------------------------------------------------------------------------------
                                              |               Robust
                                         wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
    ------------------------------------------+----------------------------------------------------------------
                                          age |   .3992368   .0190215    20.99   0.000     .3619544    .4365192
                                       female |  -2.042734   .7446907    -2.74   0.006    -3.502335   -.5831322
                                              |
                                       wbhaom |
                                       Black  |  -2.974119   .2341845   -12.70   0.000    -3.433122   -2.515115
                                    Hispanic  |  -2.053643   .1825987   -11.25   0.000    -2.411538   -1.695748
                                       Asian  |   2.248816   .3029651     7.42   0.000     1.655001     2.84263
                             Native American  |  -1.093605   .5521478    -1.98   0.048    -2.175821   -.0113901
                                       Mixed  |  -1.240459   .4965521    -2.50   0.012    -2.213706   -.2672112
                                              |
                                      citizen |   1.755564   .2445851     7.18   0.000     1.276175    2.234953
                                      married |   1.067823   .1619556     6.59   0.000     .7503881    1.385257
                                              |
                            ch02#female#c.age |
                                         0 1  |   -.089905   .0250814    -3.58   0.000    -.1390648   -.0407452
                                         1 0  |   .0100487   .0071585     1.40   0.160    -.0039821    .0240795
                                         1 1  |   -.044066    .028712    -1.53   0.125    -.1003418    .0122098
                                              |
                            ch35#female#c.age |
                                         0 1  |  -.0424212   .0087071    -4.87   0.000    -.0594873   -.0253552
                                         1 0  |   .0400512   .0070497     5.68   0.000     .0262338    .0538686
                                         1 1  |          0  (omitted)
                                              |
                                        unmem |   1.627953   .2291285     7.10   0.000     1.178859    2.077047
                                      multjob |  -.9475845   .6570856    -1.44   0.149    -2.235479    .3403102
                                        rural |  -2.577632   .2332299   -11.05   0.000    -3.034765   -2.120499
                                              |
                                       educ92 |
                            HS graduate, GED  |   3.905782   .2178174    17.93   0.000     3.478858    4.332706
                  Some college but no degree  |   5.537558   .2608483    21.23   0.000     5.026293    6.048824
    Associate degree-occupational/vocational  |   6.841235    .346747    19.73   0.000     6.161607    7.520863
           Associate degree-academic program  |   6.360889   .2959218    21.50   0.000      5.78088    6.940899
                           Bachelor's degree  |   14.33761   .3526292    40.66   0.000     13.64645    15.02877
                             Master's degree  |   18.76326   .4418553    42.46   0.000     17.89722     19.6293
                         Professional school  |   24.19885   .7906149    30.61   0.000     22.64924    25.74846
                                   Doctorate  |   23.27138   .6032447    38.58   0.000     22.08902    24.45375
                                              |
                                      ind_m03 |  -.6025383   .0299457   -20.12   0.000    -.6612321   -.5438445
                                      occ_m03 |  -.9488958   .0495613   -19.15   0.000    -1.046037   -.8517551
                                      uhourse |  -.0694116   .0267714    -2.59   0.010    -.1218839   -.0169394
                                        _cons |   15.20742   .9416845    16.15   0.000      13.3617    17.05313
    -----------------------------------------------------------------------------------------------------------
    
    . vif
    
        Variable |       VIF       1/VIF  
    -------------+----------------------
             age |      2.33    0.429822
          female |     29.93    0.033412
          wbhaom |
              2  |      1.10    0.911629
              3  |      1.35    0.743107
              4  |      1.16    0.864481
              5  |      1.01    0.988987
              6  |      1.01    0.989727
         citizen |      1.32    0.755477
         married |      1.15    0.870870
     ch02#female#|
           c.age |
            0 1  |     35.93    0.027828
            1 0  |      1.22    0.820696
            1 1  |      9.00    0.111072
     ch35#female#|
           c.age |
            0 1  |      4.30    0.232823
            1 0  |      1.19    0.839867
           unmem |      1.04    0.963348
         multjob |      1.01    0.987560
           rural |      1.06    0.939718
          educ92 |
              9  |      4.05    0.246908
             10  |      3.33    0.300139
             11  |      1.89    0.529599
             12  |      2.20    0.454326
             13  |      4.97    0.201063
             14  |      3.50    0.285900
             15  |      1.39    0.718809
             16  |      1.58    0.634403
         ind_m03 |      1.32    0.758237
         occ_m03 |      1.54    0.648567
         uhourse |      1.13    0.888345
    -------------+----------------------
        Mean VIF |      4.36
    
    .

  • #2
    Luis:
    I would rewrite your code as follows (I assume that -ch*- are categorical variables):
    Code:
    reg wage i.wbhaom citizen married i.ch02##c.age##i.female i.ch35##c.age##i.female unmem multjob rural i(8/16).educ92 ind_m03 occ_m03 uhourse, robust
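    For reference, the ## operator is shorthand for the main effects plus every lower-order interaction. As a sketch, the ch02 part of the command above could be written out long-hand roughly as follows (the other covariates are left out for brevity, and the /// line continuation only works in a do-file):

    ```stata
    * Long-hand equivalent of i.ch02##c.age##i.female:
    * main effects, all two-way interactions, and the three-way interaction.
    * Variable names are those used in this thread.
    regress wage i.ch02 c.age i.female ///
        i.ch02#c.age i.ch02#i.female c.age#i.female ///
        i.ch02#c.age#i.female, vce(robust)
    ```

    This is why age and female no longer appear as separate terms in the rewritten command: ## already includes them.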
    That said:
    1) your model is way too complicated. Whenever I use interactions, the first question I ask myself is: how will I explain the results of the interaction(s) to myself and to reviewers? If the answer does not sound convincing, I switch to a more parsimonious specification;
    2) Stata tells you that there is perfect multicollinearity between -1.ch35#1.female#c.age- and something else (-ch35#c.age#female-, I suspect). Therefore, Stata omits one part of the problem;
    3) your average VIF is not a source of concern. As expected, interaction terms show high(er) VIF values.
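    One possible way to make the interaction easier to explain is to let -margins- translate the coefficients into age slopes and predicted wages for each group. The lines below are only a sketch. They assume that the regression with the ## terms has just been run and that ch02 and female are 0/1 indicators; the age grid 25(10)55 is an arbitrary illustration, not something taken from the data:

    ```stata
    * Slope of wage with respect to age in each ch02-by-female cell
    margins ch02#female, dydx(age)

    * Predicted wage at a few illustrative ages for each cell, then plot
    margins ch02#female, at(age=(25(10)55))
    marginsplot

    * Joint test of the three-way interaction terms for ch02
    testparm i.ch02#c.age#i.female
    ```

    The same lines can be repeated with ch35 in place of ch02. If the joint test gives little support for the three-way terms, that would be one argument for the more parsimonious specification mentioned in point 1.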
    Kind regards,
    Carlo
    (Stata 18.0 SE)
