running regression separately for sub samples vs joint interaction terms

Ama Perera

Join Date: Mar 2019

Posts: 43
#1

22 Feb 2021, 00:22

Hello everyone,

I have a panel data set comprising 25 countries and 2000 banks. To test whether the effect of my independent variables (x1, x2, x3, x4) on y (the dependent variable) changes with the countries' level of development, I run my base regression separately for developed and emerging countries, as follows:

Method 1

xtreg y x1 x2 x3 x4 i.year if dev=="Developed", fe vce(cluster firm_id) (1)

and

xtreg y x1 x2 x3 x4 i.year if dev=="Emerging", fe vce(cluster firm_id) (2)

In these regressions, the coefficient of x2 for developed countries is positive and statistically significant at the 1% level, whereas the corresponding coefficient for emerging countries is positive but not statistically significant, according to the p-values. Hence, I interpreted the results as showing that the effect of x2 on y is significant only in developed countries.

However, when I run a joint model with interaction terms as indicated below, the interaction term between x2 and the group dummy (d_dummy = 1 if the country is developed and 0 if it is emerging) is not significant.

Method 2

xtreg y x1 x2 x3 x4 i.year i.d_dummy#(c.x1 c.x2 c.x3 c.x4 i.year), fe vce(cluster firm_id)

The p-value for d_dummy#c.x2 is 0.914.

All other statistics are similar: the number of observations is the same, and I get the same coefficients for all independent variables from both methods.

My questions are:
(01) Is it incorrect to conclude from Method 1 (running regressions for the two sub-samples) that the effect of x2 on y is significant only in developed countries?
(02) Why are the results different between the two methods?
(03) Are there any other methods to test whether the coefficients differ significantly across groups? Is a Wald test possible and, if so, can someone help me with the command to conduct it?

Thank You.

Last edited by Ama Perera; 22 Feb 2021, 00:34.
Tags: interaction, panel data
lal mohan kumar

Join Date: May 2019

Posts: 265
#2

22 Feb 2021, 00:34

There are experts in this forum to help you in this regard. However, I shall share 3 sources which I found useful for my own understanding:
1. https://anhqle.github.io/interaction-term/
The essence is that sub-sample analysis is equivalent to a fully interacted model; it shows how all coefficients (not just the coefficient of the variable of interest) differ across groups.
2. https://www3.nd.edu/~rwilliam/stats2/l51.pdf by Richard Williams
3. https://www.scielo.br/pdf/jbpneu/v43...3-03-00162.pdf
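
The equivalence described in source 1 can be checked directly in Stata. The following is only a minimal sketch on the nlswork data (not the poster's data): the dummy black, the covariates tenure and age, and the scalar names are illustrative choices, and race is assumed to be constant within idcode so that the grouping variable does not vary within a panel.

```stata
* Illustrative sketch on nlswork; variable choices are assumptions, not the poster's model
webuse nlswork, clear
xtset idcode year
generate byte black = (race == 2)   // assumed time-invariant within idcode

* Separate fixed-effects regressions for each group
xtreg ln_wage tenure age if black == 0, fe vce(cluster idcode)
scalar b_sub0 = _b[tenure]
xtreg ln_wage tenure age if black == 1, fe vce(cluster idcode)
scalar b_sub1 = _b[tenure]

* Fully interacted model: every slope is allowed to differ by group
* (the main effect of black is absorbed by the fixed effects and is omitted)
xtreg ln_wage c.(tenure age)##i.black, fe vce(cluster idcode)

* Group-specific slopes of tenure recovered from the joint model
lincom tenure                        // black == 0
lincom tenure + 1.black#c.tenure     // black == 1

* Compare with the sub-sample estimates
display b_sub0, _b[tenure]
display b_sub1, _b[tenure] + _b[1.black#c.tenure]
```

Under these assumptions the two displayed pairs should coincide, while the coefficient on 1.black#c.tenure is the difference between the two sub-sample slopes, which the separate regressions do not test.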

Carlo Lazzaro

Join Date: Apr 2014

Posts: 17712
#3

22 Feb 2021, 01:56

Ama:
as an aside to lal's helpful advice:
1) with (01) you're implicitly taking the effect of the covariates to be the same in both regressions, which is rarely (if ever) the case;
2) the results are different because the two models (with and without interaction) are different;
3) you can apply -test- on the coefficients retrieved via:

Code:

mat list e(b)

as you can see in the following toy-example:

Code:

. use "https://www.stata-press.com/data/r16/nlswork.dta"
. xtreg ln_wage i.c_city if nev_mar==0, fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =     21,963
Group variable: idcode                          Number of groups  =      3,999

R-sq:                                           Obs per group:
     within  = 0.0004                                         min =          1
     between = 0.0047                                         avg =        5.5
     overall = 0.0019                                         max =         15

                                                F(1,3998)         =       3.45
corr(u_i, Xb)  = -0.0840                        Prob > F          =     0.0634

                             (Std. Err. adjusted for 3,999 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    1.c_city |  -.0250912   .0135141    -1.86   0.063    -.0515864    .0014039
       _cons |   1.697672   .0043201   392.97   0.000     1.689202    1.706142
-------------+----------------------------------------------------------------
     sigma_u |  .43013416
     sigma_e |   .3152648
         rho |  .65052972   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. xtreg ln_wage i.c_city if nev_mar==1, fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =      6,547
Group variable: idcode                          Number of groups  =      1,972

R-sq:                                           Obs per group:
     within  = 0.0001                                         min =          1
     between = 0.0105                                         avg =        3.3
     overall = 0.0087                                         max =         15

                                                F(1,1971)         =       0.13
corr(u_i, Xb)  = -0.1215                        Prob > F          =     0.7234

                             (Std. Err. adjusted for 1,972 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    1.c_city |  -.0096975   .0274014    -0.35   0.723    -.0634362    .0440412
       _cons |   1.630594   .0132215   123.33   0.000     1.604664    1.656523
-------------+----------------------------------------------------------------
     sigma_u |   .4122355
     sigma_e |  .29875853
         rho |  .65563865   (fraction of variance due to u_i)
------------------------------------------------------------------------------


. xtreg ln_wage i.c_city##i.nev_mar , fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,711

R-sq:                                           Obs per group:
     within  = 0.0265                                         min =          1
     between = 0.0006                                         avg =        6.1
     overall = 0.0024                                         max =         15

                                                F(3,4710)         =     129.13
corr(u_i, Xb)  = -0.1533                        Prob > F          =     0.0000

                               (Std. Err. adjusted for 4,711 clusters in idcode)
--------------------------------------------------------------------------------
               |               Robust
       ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
      1.c_city |  -.0177856   .0121675    -1.46   0.144    -.0416395    .0060684
     1.nev_mar |  -.1885751   .0128516   -14.67   0.000    -.2137702     -.16338
               |
c_city#nev_mar |
          1 1  |     -.0072    .018072    -0.40   0.690    -.0426296    .0282295
               |
         _cons |   1.725467   .0047976   359.65   0.000     1.716062    1.734873
---------------+----------------------------------------------------------------
       sigma_u |  .42912314
       sigma_e |  .31599923
           rho |  .64839878   (fraction of variance due to u_i)
--------------------------------------------------------------------------------

 
. mat list e(b)

e(b)[1,9]
            0b.          1.         0b.          1.  0b.c_city#  0b.c_city#  1o.c_city#   1.c_city#           
        c_city      c_city     nev_mar     nev_mar  0b.nev_mar  1o.nev_mar  0b.nev_mar   1.nev_mar       _cons
y1           0  -.01778558           0  -.18857512           0           0           0  -.00720002   1.7254673

. test 0b.c_city#1o.nev_mar=1.c_city#1.nev_mar

 ( 1)  0b.c_city#1o.nev_mar - 1.c_city#1.nev_mar = 0

       F(  1,  4710) =    0.16
            Prob > F =    0.6903
.
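
Applied to the model in post #1, the same idea might look like the sketch below. It is not run on the poster's data: the names y, x1-x4, year and firm_id are taken from post #1, and d_dummy is assumed to be a numeric 0/1 variable (1 = developed, 0 = emerging) that is constant within bank.

```stata
* Sketch only: variable names follow post #1; d_dummy is an assumed numeric 0/1 indicator
xtreg y c.(x1 x2 x3 x4)##i.d_dummy i.year##i.d_dummy, fe vce(cluster firm_id)

* Wald test: is the slope of x2 the same in developed and emerging countries?
test 1.d_dummy#c.x2

* Joint Wald test that all four slopes are equal across the two groups
test 1.d_dummy#c.x1 1.d_dummy#c.x2 1.d_dummy#c.x3 1.d_dummy#c.x4

* Slope of x2 within each group, with its own standard error
lincom x2                       // emerging (d_dummy == 0)
lincom x2 + 1.d_dummy#c.x2      // developed (d_dummy == 1)
```

The first -test- addresses question (03) for x2 alone; the joint -test- asks whether the two groups differ in any of the four slopes.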

Kind regards,
Carlo
(Stata 19.0)

Ama Perera

Join Date: Mar 2019

Posts: 43
#4

22 Feb 2021, 17:27

Thanks lal for the helpful resources and thank you Carlo for the useful demonstration.