
  • factor variables and interactions

    Hi,

    Suppose I have three variables: an outcome Y, a dummy variable D = 0,1, and a variable E that takes three values, E = 1,2,3.

    I want to estimate the following interacted model (without constant):

    Code:
    gen E1 = E == 1
    gen E2 = E == 2
    gen E3 = E == 3
    gen D_E1 = E1 * D
    gen D_E2 = E2 * D
    gen D_E3 = E3 * D
    
    reg Y E1 E2 E3 D_E1 D_E2 D_E3, nocons
    I'd like to use Stata's factor-variable notation for this purpose, but I am not sure whether this is possible. When I type

    Code:
    reg Y i.E##i.D, nocons
    Stata includes a dummy for E1, a dummy for E2, a dummy for D1, and the interactions E1#D1 and E2#D1. My model, however, should not include the non-interacted effect of D1 (see above).

    Thank you so much!

  • #2
    Dominique:
    I hope that the following toy example is helpful:
    Code:
    . set obs 12
    number of observations (_N) was 0, now 12
    
    . g D=0 in 1/6
    (6 missing values generated)
    
    . replace D=1 if D==.
    (6 real changes made)
    
    . g E=1 in 1/4
    (8 missing values generated)
    
    . replace E=2 in 5/8
    (4 real changes made)
    
    . replace E=3 if E==.
    (4 real changes made)
    
    . g Y=runiform()*1000
    
    . reg Y i.D##i.E
    note: 0b.D#3.E identifies no observations in the sample
    note: 1.D#1b.E identifies no observations in the sample
    note: 1.D#2.E omitted because of collinearity
    note: 1.D#3.E omitted because of collinearity
    
          Source |       SS           df       MS      Number of obs   =        12
    -------------+----------------------------------   F(3, 8)         =      2.72
           Model |  552323.622         3  184107.874   Prob > F        =    0.1148
        Residual |   541990.13         8  67748.7663   R-squared       =    0.5047
    -------------+----------------------------------   Adj R-squared   =    0.3190
           Total |  1094313.75        11  99483.0684   Root MSE        =    260.29
    
    ------------------------------------------------------------------------------
               Y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             1.D |  -412.6576   260.2859    -1.59   0.152    -1012.878    187.5629
                 |
               E |
              2  |   414.6539   225.4142     1.84   0.103    -105.1522    934.4601
              3  |    849.558   318.7839     2.66   0.029     114.4411    1584.675
                 |
             D#E |
            0 3  |          0  (empty)
            1 1  |          0  (empty)
            1 2  |          0  (omitted)
            1 3  |          0  (omitted)
                 |
           _cons |   195.2401    130.143     1.50   0.172    -104.8701    495.3504
    ------------------------------------------------------------------------------
    
    . reg Y i.D##i.E, nocons
    note: 0b.D#3.E identifies no observations in the sample
    note: 1.D#1b.E identifies no observations in the sample
    note: 1.D#2.E omitted because of collinearity
    note: 1.D#3.E omitted because of collinearity
    
          Source |       SS           df       MS      Number of obs   =        12
    -------------+----------------------------------   F(3, 9)         =     10.45
           Model |  2420152.94         3  806717.646   Prob > F        =    0.0027
        Residual |  694464.989         9  77162.7766   R-squared       =    0.7770
    -------------+----------------------------------   Adj R-squared   =    0.7027
           Total |  3114617.93        12  259551.494   Root MSE        =    277.78
    
    ------------------------------------------------------------------------------
               Y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             1.D |  -412.6576   277.7819    -1.49   0.172    -1041.044    215.7287
                 |
               E |
              2  |   609.8941   196.4215     3.11   0.013     165.5579     1054.23
              3  |   1044.798   310.5696     3.36   0.008     342.2409    1747.355
                 |
             D#E |
            0 3  |          0  (empty)
            1 1  |          0  (empty)
            1 2  |          0  (omitted)
            1 3  |          0  (omitted)
    ------------------------------------------------------------------------------
    
    .
    Kind regards,
    Carlo
    (Stata 18.0 SE)
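
In the toy data above, D and E are not fully crossed (no observation has D=0 with E=3, or D=1 with E=1), which is why the interaction terms are reported as empty or omitted. As a minimal sketch of the model asked about in the first post, the code below builds a hypothetical fully crossed dataset (the seed and the cell layout are assumptions made only for illustration) and compares the hand-built dummies with a factor-variable specification. It relies on the `ibn.` operator (no base level) and on selecting level 1 of D with `1.D`. The two regressions should report the same six coefficients, but this is worth checking on your own data.

```stata
clear
set seed 12345                      // arbitrary seed, only for reproducibility
set obs 12

* Hypothetical balanced design: each of the 6 D-by-E cells holds 2 observations
generate byte E = mod(_n - 1, 3) + 1              // 1,2,3,1,2,3,...
generate byte D = mod(floor((_n - 1)/3), 2)       // 0,0,0,1,1,1,0,0,0,1,1,1
generate double Y = runiform() * 1000

* Confirm that no cell is empty
tabulate D E

* Hand-built version, as in the first post
generate byte E1 = E == 1
generate byte E2 = E == 2
generate byte E3 = E == 3
generate byte D_E1 = E1 * D
generate byte D_E2 = E2 * D
generate byte D_E3 = E3 * D
regress Y E1 E2 E3 D_E1 D_E2 D_E3, noconstant

* Factor-variable version: one intercept per level of E (ibn. = no base level)
* plus the effect of D == 1 within each level of E
regress Y ibn.E ibn.E#1.D, noconstant
```

In this specification each E coefficient is the mean of Y in the corresponding D=0 cell, and each E#D coefficient is the D=1 minus D=0 difference within that level of E.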


    • #3
      If you add the "allbase" option to your regression command, Stata displays all terms, including the base levels, so you can see why it is not including certain terms.
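
A short sketch of that suggestion, assuming the same toy layout as in #2 (D=0 for observations 1-6, E=1,2,3 in blocks of four). The `generate` expressions are a compact rewrite of that setup rather than the original commands, and the random Y will differ from the values shown there.

```stata
clear
set obs 12

* Same layout as the toy example in #2
generate byte D = _n > 6            // 0 for obs 1-6, 1 for obs 7-12
generate byte E = ceil(_n / 4)      // 1 for obs 1-4, 2 for 5-8, 3 for 9-12
generate double Y = runiform() * 1000

* The cross-tabulation shows which D-by-E cells are empty
tabulate D E

* allbaselevels (abbreviated allbase) lists base, empty, and omitted terms
regress Y i.D##i.E, allbaselevels
```

Reading the cross-tabulation next to the regression table makes it easier to match each "(empty)" or "(omitted)" row to a cell with no observations.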


      • #4
        Thanks a lot for your help!
