How can I statistically show that there are strong differences among groups?

Luca Haseney

Join Date: Dec 2022
Posts: 22

15 Jan 2023, 05:40

Dear Statalist community,

I have learned from my mistakes and learned to interpret interaction effects.
I ran a simple OLS regression with a two-way interaction; x1 and x2 are both dummy variables.

Code:

. reghdfe Y x1##x2 c1 c2 c3 c4 c5 c6 c7 c8 c9, absorb(FIRM Year) cluster(Industry)
(dropped 27 singleton observations)
(MWFE estimator converged in 7 iterations)

HDFE Linear regression                            Number of obs   =        487
Absorbing 2 HDFE groups                           F(  12,     21) =      63.45
Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                  R-squared       =     0.4268
                                                  Adj R-squared   =     0.1758
                                                  Within R-sq.    =     0.0595
Number of clusters (Industry) =         22        Root MSE        =     0.7048

                              (Std. err. adjusted for 22 clusters in Industry)
------------------------------------------------------------------------------
             |               Robust
           Y | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        1.x1 |  -1.305025   .0995242   -13.11   0.000    -1.511997   -1.098053
        1.x2 |   .3349848   .1583003     2.12   0.046     .0057813    .6641883
             |
       x1#x2 |
        1 1  |   1.501429   .3299879     4.55   0.000     .8151814    2.187676
             |
          c1 |  -.0236591   .0586029    -0.40   0.691    -.1455305    .0982123
          c2 |  -.0256179   .0314577    -0.81   0.425    -.0910377    .0398018
          c3 |   .1965687   .1940449     1.01   0.323    -.2069697    .6001071
          c4 |   .0366049   .0571484     0.64   0.529    -.0822417    .1554514
          c5 |   .0089514   .0059729     1.50   0.149    -.0034699    .0213726
          c6 |  -.5765208   .5296322    -1.09   0.289    -1.677951    .5249097
          c7 |   .0045407   .0007158     6.34   0.000     .0030521    .0060293
          c8 |   .0104786   .0261328     0.40   0.692    -.0438676    .0648248
          c9 |  -.3780107   .2715036    -1.39   0.178    -.9426333     .186612
       _cons |  -5.923346   2.962192    -2.00   0.059    -12.08356    .2368706

I interpret the x1 and x2 estimates as differences relative to firms for which that attribute is zero, with the other attribute held at zero:
Firms with x1=1 produce about 1.3 less Y than firms with x1=0.
Firms with x2=1 produce more Y than firms with x2=0.
Firms with both x1=1 and x2=1 produce more Y than firms with x1=0 and x2=1, or vice versa.
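
These statements can be checked by hand from the coefficient table. The following is only a sketch that copies the reported point estimates into scalars (the scalar names b_x1, b_x2 and b_x1x2 are made up for illustration); it says nothing about standard errors.

Code:

* Point estimates copied from the reghdfe table above
* (controls and fixed effects are held constant throughout)
scalar b_x1   = -1.305025
scalar b_x2   =  .3349848
scalar b_x1x2 =  1.501429

* Difference x1=1 vs. x1=0, separately for x2=0 and x2=1
display "x1 effect when x2=0: " b_x1
display "x1 effect when x2=1: " b_x1+b_x1x2

* Difference x2=1 vs. x2=0, separately for x1=0 and x1=1
display "x2 effect when x1=0: " b_x2
display "x2 effect when x1=1: " b_x2+b_x1x2

The two sums come out at roughly 0.196 and 1.836, so with these estimates the effect of each dummy depends strongly on the level of the other one.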

My question now is: does this suffice to reliably state that firms with x2=1 produce more Y than firms with x2=0?

If I drop the observations with x2=1 (or x2=0) and then look at the means, I get the following results. Is it reliable to just calculate the difference between these two means, i.e. (-2.45) - (-2.51)?
How could I then show what influence the interaction with x1 has?

Code:

. drop if x2==1
(381 observations deleted)

. univar Y
                                        -------------- Quantiles --------------
Variable       n     Mean     S.D.      Min      .25      Mdn      .75      Max
-------------------------------------------------------------------------------
       Y     133    -2.45     0.81    -4.41    -2.91    -2.34    -1.92    -0.59


. drop if x2==0
(133 observations deleted)

. 
. univar Y
                                        -------------- Quantiles --------------
Variable       n     Mean     S.D.      Min      .25      Mdn      .75      Max
-------------------------------------------------------------------------------
       Y     381    -2.51     0.77    -4.53    -2.98    -2.43    -2.00    -0.56
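
The same two means can presumably be obtained without dropping any observations. The sketch below assumes the full dataset is in memory. It gives only an unadjusted comparison: the second command clusters on Industry as in the model above, but neither command accounts for the controls, the fixed effects, or the interaction with x1.

Code:

* Raw group sizes, means and standard deviations of Y by x2, nothing dropped
tabstat Y, by(x2) statistics(n mean sd)

* Raw difference in means (coefficient on 1.x2) with cluster-robust standard errors
regress Y i.x2, vce(cluster Industry)

Because reghdfe dropped 27 singleton observations, these raw figures refer to a slightly larger sample than the 487 observations used in the regression.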


Carlo Lazzaro

Join Date: Apr 2014
Posts: 17854

15 Jan 2023, 06:12

Luca:
I would suggest a three-step approach to accomplish what you're after (it works for me, at least):
1) ask Stata to calculate the fitted values after your regression;
2) recalculate them by hand;
3) use -lincom-.
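
For the model in the first post, step 3 might look like the sketch below. It assumes the variable names from that post and that -reghdfe- is installed; the point estimates follow from the reported coefficients, but the standard errors and p-values can only come from running it on the actual data.

Code:

* Refit the model from the first post
reghdfe Y x1##x2 c1 c2 c3 c4 c5 c6 c7 c8 c9, absorb(FIRM Year) cluster(Industry)

* Effect of x2 among firms with x1==0 (simply the 1.x2 coefficient)
lincom 1.x2

* Effect of x2 among firms with x1==1 (main effect plus interaction)
lincom 1.x2 + 1.x1#1.x2

* Effect of x1 among firms with x2==1
lincom 1.x1 + 1.x1#1.x2

Each -lincom- reports a t test and a confidence interval for the corresponding difference, which is what is needed to judge whether the groups differ.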

The following toy example follows this strategy:

Code:

. use "C:\Program Files\Stata17\ado\base\a\auto.dta"
(1978 automobile data)

. regress price i.foreign##i.rep78
note: 1.foreign#1b.rep78 identifies no observations in the sample.
note: 1.foreign#2.rep78 identifies no observations in the sample.
note: 1.foreign#5.rep78 omitted because of collinearity.

      Source |       SS           df       MS      Number of obs   =        69
-------------+----------------------------------   F(7, 61)        =      0.39
       Model |    24684607         7  3526372.43   Prob > F        =    0.9049
    Residual |   552112352        61  9051022.16   R-squared       =    0.0428
-------------+----------------------------------   Adj R-squared   =   -0.0670
       Total |   576796959        68  8482308.22   Root MSE        =    3008.5

-------------------------------------------------------------------------------
        price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
--------------+----------------------------------------------------------------
      foreign |
     Foreign  |   2088.167   2351.846     0.89   0.378     -2614.64    6790.974
              |
        rep78 |
           2  |   1403.125   2378.422     0.59   0.557    -3352.823    6159.073
           3  |   2042.574   2204.707     0.93   0.358    -2366.011    6451.159
           4  |   1317.056   2351.846     0.56   0.578    -3385.751    6019.863
           5  |       -360   3008.492    -0.12   0.905    -6375.851    5655.851
              |
foreign#rep78 |
   Foreign#1  |          0  (empty)
   Foreign#2  |          0  (empty)
   Foreign#3  |  -3866.574   2980.505    -1.30   0.199    -9826.462    2093.314
   Foreign#4  |  -1708.278   2746.365    -0.62   0.536    -7199.973    3783.418
   Foreign#5  |          0  (omitted)
              |
        _cons |     4564.5   2127.325     2.15   0.036      310.651    8818.349
-------------------------------------------------------------------------------

. predict fitted, xb

. list make fitted rep78 foreign if rep78==3 & foreign==1

     +---------------------------------------------+
     | make               fitted   rep78   foreign |
     |---------------------------------------------|
 54. | Audi Fox         4828.667       3   Foreign |
 60. | Fiat Strada      4828.667       3   Foreign |
 65. | Renault Le Car   4828.667       3   Foreign |
     +---------------------------------------------+

. di  4564.5+2088.167+2042.574-3866.574
4828.667

. list make fitted rep78 foreign if rep78==3 & foreign==0

     +-------------------------------------------------+
     | make                  fitted   rep78    foreign |
     |-------------------------------------------------|
  1. | AMC Concord         6607.074       3   Domestic |
  2. | AMC Pacer           6607.074       3   Domestic |
  4. | Buick Century       6607.074       3   Domestic |
  6. | Buick LeSabre       6607.074       3   Domestic |
  8. | Buick Regal         6607.074       3   Domestic |
     |-------------------------------------------------|
  9. | Buick Riviera       6607.074       3   Domestic |
 10. | Buick Skylark       6607.074       3   Domestic |
 11. | Cad. Deville        6607.074       3   Domestic |
 13. | Cad. Seville        6607.074       3   Domestic |
 14. | Chev. Chevette      6607.074       3   Domestic |
     |-------------------------------------------------|
 16. | Chev. Malibu        6607.074       3   Domestic |
 19. | Chev. Nova          6607.074       3   Domestic |
 25. | Ford Mustang        6607.074       3   Domestic |
 26. | Linc. Continental   6607.074       3   Domestic |
 27. | Linc. Mark V        6607.074       3   Domestic |
     |-------------------------------------------------|
 28. | Linc. Versailles    6607.074       3   Domestic |
 31. | Merc. Marquis       6607.074       3   Domestic |
 32. | Merc. Monarch       6607.074       3   Domestic |
 34. | Merc. Zephyr        6607.074       3   Domestic |
 36. | Olds Cutl Supr      6607.074       3   Domestic |
     |-------------------------------------------------|
 37. | Olds Cutlass        6607.074       3   Domestic |
 39. | Olds Omega          6607.074       3   Domestic |
 41. | Olds Toronado       6607.074       3   Domestic |
 42. | Plym. Arrow         6607.074       3   Domestic |
 44. | Plym. Horizon       6607.074       3   Domestic |
     |-------------------------------------------------|
 49. | Pont. Grand Prix    6607.074       3   Domestic |
 50. | Pont. Le Mans       6607.074       3   Domestic |
     +-------------------------------------------------+

. di  4564.5+2042.574
6607.074

. lincom (_b[_cons]+_b[1.foreign]+_b[3.rep78]+_b[1.foreign#3.rep78])-(_b[_cons]+_b[0.foreign]+_b[3.rep78]+_b[0.foreign#3.rep78])

 ( 1)  - 0b.foreign + 1.foreign - 0b.foreign#3o.rep78 + 1.foreign#3.rep78 = 0

------------------------------------------------------------------------------
       price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         (1) |  -1778.407    1830.91    -0.97   0.335    -5439.538    1882.723
------------------------------------------------------------------------------
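
As the constraint printed by -lincom- shows, the intercept and the rep78 term cancel and the base-level terms are zero, so the same contrast can be requested more compactly. The sketch below refits the toy model so that it runs on its own. The -margins- line is an alternative that I would expect to reproduce the same estimate (2088.167 - 3866.574 = -1778.407), but please check it against the -lincom- result.

Code:

sysuse auto, clear
regress price i.foreign##i.rep78

* Foreign vs. domestic at rep78==3: only two coefficients remain
lincom 1.foreign + 1.foreign#3.rep78

* The same contrast requested through -margins- with the r. (reference) operator
margins r.foreign, at(rep78=3)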

.

Kind regards,
Carlo
(Stata 19.0)
