  • Interaction terms in panel data

    Hi there

    I am having a problem with including interaction terms in my panel data fixed effects model, and I would be grateful for any help:

    I am working with panel data where ID identifies a specific individual across waves. I am studying the effect of different unemployment durations on my dependent variable (y). I have several dummy variables representing different durations of unemployment (unemp_lessthan4mths unemp_4to8mths unemp_8to12mths unemp_1to2yrs), and also a dummy variable for sex. I want to compare the coefficients on the unemployment durations between males and females.

    I can get coefficients for each group by typing:

    (a)
    Code:
    xtreg y unemp_lessthan4mths unemp_4to8mths unemp_8to12mths unemp_1to2yrs if sex==0, fe vce(cluster ID)
    xtreg y unemp_lessthan4mths unemp_4to8mths unemp_8to12mths unemp_1to2yrs if sex==1, fe vce(cluster ID)
    However, I don't think I can test whether the coefficients are statistically significantly different when they come from separate regressions, because 'suest' can't be used with panel data (please correct me if I am wrong).

    So I tried creating a series of interaction terms:
    Code:
    xtreg y i.sex##(unemp_lessthan4mths unemp_4to8mths unemp_8to12mths unemp_1to2yrs), fe vce(cluster ID)

    My understanding is that this should have produced the same coefficients for unemp_lessthan4mths unemp_4to8mths unemp_8to12mths unemp_1to2yrs as in the sex==0 regression, and that the coefficients on the interaction terms should have been the differences between the sex==1 and sex==0 coefficients.

    However, Stata does not seem to do this. The coefficients on unemp_lessthan4mths unemp_4to8mths unemp_8to12mths unemp_1to2yrs outside the interaction terms are not the same as those produced when regressing either group alone (as in (a) above), and the interaction term coefficients are not the differences between the two sets of coefficients in (a).

    Also, the following odd things happen:
    1) I get the messages "note: 1.sex omitted because of collinearity" and "note: sex omitted because of collinearity"
    2) Stata puts a "1." in front of all the variables (unemp_lessthan4mths unemp_4to8mths unemp_8to12mths unemp_1to2yrs) when they are not part of an interaction term

    From everything I have read on the internet, I can't work out why the coefficients don't match up. I would be so grateful for any help, I just don't understand what is happening!

    This is a sample of my data if this helps.

    clear
    input long ID float y byte sex float(unemp_lessthan4mths unemp_4to8mths unemp_8to12mths unemp_1to2yrs)
    123 10 1 0 0 0 0
    123 11 1 0 0 0 0
    223 12 1 0 0 0 0
    223 12 1 0 0 0 0
    223 12 1 0 0 0 0
    323 12 0 0 0 0 0
    323 6 0 0 0 0 0
    323 9 0 0 0 0 0
    423 12 0 0 0 0 0
    423 12 0 0 0 0 0
    423 12 0 0 0 0 0
    523 12 1 0 0 0 0
    523 12 1 0 0 0 0
    523 12 1 0 0 0 0
    523 12 1 0 0 0 0
    523 12 1 0 0 0 0
    523 12 1 0 0 0 0
    623 12 1 0 0 0 0
    623 2 1 0 0 0 0
    623 12 1 0 0 0 0
    623 10 1 0 0 0 0
    623 11 1 0 0 0 0
    723 5 0 0 0 0 0
    723 10 0 0 0 0 0
    723 0 0 0 0 0 0
    723 0 0 0 0 0 0
    723 12 0 0 0 0 0
    723 9 0 0 0 0 0
    723 12 0 0 0 0 0
    723 11 0 0 0 0 0
    723 12 0 0 0 0 0
    723 12 0 0 0 0 0
    823 12 1 0 0 0 0
    823 12 1 0 0 0 0
    823 12 1 0 0 0 0
    823 12 1 0 0 0 0
    923 12 0 0 0 0 0
    923 12 0 0 0 0 0
    923 9 0 0 0 0 0
    923 12 0 0 0 0 0
    923 11 0 0 0 0 0
    923 10 0 0 0 0 0
    923 10 0 0 0 0 0
    923 9 0 0 0 0 0
    923 11 0 0 0 0 0
    923 12 0 0 0 0 0
    124 8 1 0 0 0 0
    124 6 1 0 0 0 0
    124 3 1 0 0 0 0
    124 9 1 0 0 0 0
    224 12 0 0 0 0 0
    224 12 0 0 0 0 0
    224 12 0 0 0 0 0
    224 11 0 0 0 0 0
    224 11 0 0 0 0 0
    224 12 0 0 0 0 0
    224 11 0 0 0 0 0
    224 12 0 0 0 0 0
    224 12 0 0 0 0 0
    224 11 0 0 0 0 0
    224 12 0 0 0 0 0
    225 12 1 0 0 0 0
    end

  • #2
    Welcome to the Stata Forum / Statalist,

    Regarding sex in a fixed-effects model: because it is a time-invariant variable, it is not expected to be part of the output. Also, the other variables seem to show the same pattern in the data you posted, but further information is needed.

    As for the 'odd things', the number 1 in front of the variables is due to factor-variable notation.
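
    To illustrate the factor-variable point, here is a minimal sketch using the variable names from post #1. It assumes the panel has already been declared (for example with xtset ID), as the xtreg calls in post #1 imply. Variables listed bare inside the parentheses after ## are treated as factor variables, which is where the "1." prefix comes from; the c. prefix treats them as continuous instead.

    ```stata
    * Assumes the panel is already declared (e.g. xtset ID), as in post #1.

    * Bare variables inside ##( ) are treated as factor variables,
    * so the output labels them 1.unemp_lessthan4mths, etc.
    xtreg y i.sex##(unemp_lessthan4mths unemp_4to8mths unemp_8to12mths unemp_1to2yrs), fe vce(cluster ID)

    * The c. prefix treats them as continuous. For 0/1 dummies the
    * estimates should be the same; only the "1." labels disappear.
    xtreg y i.sex##c.(unemp_lessthan4mths unemp_4to8mths unemp_8to12mths unemp_1to2yrs), fe vce(cluster ID)
    ```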

    I'm not sure whether this message clarified your doubts.

    I recommend sharing data within CODE delimiters, or installing dataex from SSC to produce the data example, as recommended in the FAQ.
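
    As a rough sketch of what that looks like, using the variable names from post #1 (dataex may already be available in recent Stata releases, in which case the first command is unnecessary):

    ```stata
    * Install dataex from SSC (skip if it is already installed)
    ssc install dataex

    * Print a copy-and-paste data example for the variables in post #1
    dataex ID y sex unemp_lessthan4mths unemp_4to8mths unemp_8to12mths unemp_1to2yrs
    ```

    The output can then be pasted into a post between CODE delimiters.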

    This is the best way to entice helpful replies.
    Best regards,

    Marcos

    • #3
      Adding to Marcos' fine advice, the expectation that the two separate regressions would produce results matching the interaction regression in the way you describe is incorrect. That would happen with a non-hierarchical model, e.g. -regress-. But when you combine the two sexes into a single sample for fixed-effects regression you are not only adding more observations: you are adding additional variables to the model. The specific new variables you are adding are the person-level fixed effects from the opposite sex. So it is actually a different regression equation, even though it looks, superficially, like the same one you were separately fitting to males and females. In this regard it is like any other regression situation: if you add some new variables to a pre-existing model, the results can change.

      • #4
        The coefficients should match up. However, dropping sex is certainly expected because it does not change over time. But that doesn't affect the other coefficients. Below is a similar example, where "black" is a binary variable that does not change over time and union and married are binary variables that do. BTW, I assume there is an omitted group in the unemployment duration dummies? Unemployed longer than two years? Not unemployed? If you have not dropped a base group dummy then it could make the results hard to interpret.

        Code:
        . xtreg lwage i.black##(married union), fe
        note: 1.black omitted because of collinearity
        
        Fixed-effects (within) regression               Number of obs     =      4,360
        Group variable: nr                              Number of groups  =        545
        
        R-sq:                                           Obs per group:
             within  = 0.0515                                         min =          8
             between = 0.0553                                         avg =        8.0
             overall = 0.0535                                         max =          8
        
                                                        F(4,3811)         =      51.71
        corr(u_i, Xb)  = -0.0061                        Prob > F          =     0.0000
        
        -------------------------------------------------------------------------------
                lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        --------------+----------------------------------------------------------------
              1.black |          0  (omitted)
            1.married |   .2481957   .0184892    13.42   0.000      .211946    .2844453
              1.union |    .050089   .0226769     2.21   0.027      .005629     .094549
                      |
        black#married |
                 1 1  |  -.0874722   .0628031    -1.39   0.164     -.210603    .0356587
                      |
          black#union |
                 1 1  |   .1275605   .0559304     2.28   0.023     .0179041    .2372169
                      |
                _cons |   1.525105   .0108442   140.64   0.000     1.503844    1.546366
        --------------+----------------------------------------------------------------
              sigma_u |  .37981005
              sigma_e |  .37733089
                  rho |  .50327434   (fraction of variance due to u_i)
        -------------------------------------------------------------------------------
        F test that all u_i=0: F(544, 3811) = 7.94                   Prob > F = 0.0000
        
        . xtreg lwage married union if ~black, fe
        
        Fixed-effects (within) regression               Number of obs     =      3,856
        Group variable: nr                              Number of groups  =        482
        
        R-sq:                                           Obs per group:
             within  = 0.0520                                         min =          8
             between = 0.0494                                         avg =        8.0
             overall = 0.0504                                         max =          8
        
                                                        F(2,3372)         =      92.45
        corr(u_i, Xb)  = -0.0223                        Prob > F          =     0.0000
        
        ------------------------------------------------------------------------------
               lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
             married |   .2481957   .0185088    13.41   0.000      .211906    .2844853
               union |    .050089    .022701     2.21   0.027       .00558    .0945981
               _cons |   1.539406   .0116903   131.68   0.000     1.516486    1.562327
        -------------+----------------------------------------------------------------
             sigma_u |  .37332057
             sigma_e |  .37773147
                 rho |  .49412724   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        F test that all u_i=0: F(481, 3372) = 7.67                   Prob > F = 0.0000
        
        . xtreg lwage married union if black, fe
        
        Fixed-effects (within) regression               Number of obs     =        504
        Group variable: nr                              Number of groups  =         63
        
        R-sq:                                           Obs per group:
             within  = 0.0475                                         min =          8
             between = 0.0865                                         avg =        8.0
             overall = 0.0679                                         max =          8
        
                                                        F(2,439)          =      10.94
        corr(u_i, Xb)  = 0.0738                         Prob > F          =     0.0000
        
        ------------------------------------------------------------------------------
               lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
             married |   .1607235   .0595281     2.70   0.007      .043728     .277719
               union |   .1776496   .0507081     3.50   0.001     .0779887    .2773105
               _cons |   1.415689   .0283649    49.91   0.000     1.359942    1.471437
        -------------+----------------------------------------------------------------
             sigma_u |  .41321237
             sigma_e |  .37423968
                 rho |  .54937121   (fraction of variance due to u_i)
        ------------------------------------------------------------------------------
        F test that all u_i=0: F(62, 439) = 9.61                     Prob > F = 0.0000
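
        Notice that in the first estimation the coefficients on 1.married and 1.union correspond to the black = 0 group, and the coefficients on the interactions are the differences between the black = 1 and black = 0 groups. For example, .2481957 - .0874722 = .1607235, which is the married coefficient in the black = 1 regression.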
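
        Applied to the model in post #1, the interaction coefficients can then be tested directly, which avoids the need for suest. The following is only a sketch under assumptions: it uses the variable names from post #1, assumes the panel is already declared, and assumes every duration dummy varies within individuals for both sexes so that all interaction terms are estimated.

        ```stata
        * Fit the interacted fixed-effects model from post #1
        xtreg y i.sex##(unemp_lessthan4mths unemp_4to8mths unemp_8to12mths unemp_1to2yrs), fe vce(cluster ID)

        * Joint test that all four duration coefficients are equal across sexes
        testparm i.sex#i.unemp_lessthan4mths i.sex#i.unemp_4to8mths ///
                 i.sex#i.unemp_8to12mths i.sex#i.unemp_1to2yrs

        * Test a single difference between the sexes
        test 1.sex#1.unemp_lessthan4mths

        * Recover the sex==1 coefficient for one duration:
        * the sex==0 coefficient plus the interaction
        lincom 1.unemp_lessthan4mths + 1.sex#1.unemp_lessthan4mths
        ```

        The /// line continuation works in a do-file; in the Command window the testparm call would need to go on one line.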

        • #5
          Dear Everyone

          Thank you all so much for your help. I now understand much better!

          Also, apologies for not including my data in the correct format; I have now installed dataex from SSC for future posts.

          Best wishes

          Jay
