Interaction variables

Clara Weber

Join Date: Jan 2024
Posts: 14

16 Apr 2024, 08:21

Hi everybody!

I am running a fixed effects panel model where female labor force participation is the dependent variable and investment into the private sector is the main independent variable. I also want to look at regional differences, so I created region dummies.
I am not quite sure how to continue though... I ran different regressions and got different results, but I am not sure which Stata command is the correct one. In 1) I looked at all regions (this is my main regression). In 2) I only included the EAP region; however, my observations obviously decrease. So, in order to keep the sample bigger, I introduced the region dummy in 3) and 4). But what is the exact difference between 3) & 4)? I thought it would be the same thing, just a different way to write the Stata command, but I get different results.

1) xtreg FLFP l3.logInvestment logGDPcurrentPPP l5.fsecondary fertility WBL trade FDI i.year, vce(cluster CountryCode_num) fe
outreg2 using Table5.doc, replace ctitle(FE) addstat ("Adjusted R2", e(r2_a), "RMSE", e(rmse)) addtext (Year FE, YES, Robust std. errors, YES )keep(l3.logInvestment EAP c.l3.logInvestment#EAP logGDPcurrentPPP l5.fsecondary fertility WBL trade FDI)

2) xtreg FLFP l3.logInvestment logGDPcurrentPPP l5.fsecondary fertility WBL trade FDI i.year if Reg == "EAP", vce(cluster CountryCode_num) fe
outreg2 using Table5.doc, append ctitle(FE) addstat ("Adjusted R2", e(r2_a), "RMSE", e(rmse)) addtext (Year FE, YES, Robust std. errors, YES )keep(l3.logInvestment EAP c.l3.logInvestment#EAP logGDPcurrentPPP l5.fsecondary fertility WBL trade FDI)

3) xtreg FLFP l3.logInvestment EAP c.l3.logInvestment#EAP logGDPcurrentPPP l5.fsecondary fertility WBL trade FDI i.year, vce(cluster CountryCode_num) fe
outreg2 using Table5.doc, append ctitle(FE) addstat ("Adjusted R2", e(r2_a), "RMSE", e(rmse)) addtext (Year FE, YES, Robust std. errors, YES )keep(l3.logInvestment EAP c.l3.logInvestment#EAP logGDPcurrentPPP l5.fsecondary fertility WBL trade FDI)

4) xtreg FLFPratio c.l3.logInvestment##EAP logGDPcurrentPPP l5.fsecondary fertility WBL trade FDI i.year, vce(cluster CountryCode_num) fe
outreg2 using Table5.doc, append ctitle(FE) addstat ("Adjusted R2", e(r2_a), "RMSE", e(rmse)) addtext (Year FE, YES, Robust std. errors, YES )keep(c.l3.logInvestment##EAP logGDPcurrentPPP l5.fsecondary fertility WBL trade FDI)

	(1)	(2)	(3)	(4)
VARIABLES	FE	FE	FE	FE

L3.logInvestment	0.0702	0.121	0.143	0.273*
	(0.123)	(0.207)	(0.129)	(0.154)
0b.EAP#coL3.logInvestment				0
				(0)
1.EAP#cL3.logInvestment			-0.848*	-1.141**
			(0.451)	(0.505)
logGDPcurrentPPP	1.086	-1.132	1.316	0.0306
	(2.639)	(4.451)	(2.624)	(3.137)
L5.fsecondary	0.0874	0.0470	0.0937	0.199***
	(0.0592)	(0.116)	(0.0587)	(0.0750)
fertility	2.974**	6.456	2.995**	0.913
	(1.363)	(3.818)	(1.330)	(1.502)
WBL	0.000528	-0.00199	-0.00345	0.0107
	(0.0536)	(0.148)	(0.0546)	(0.0704)
trade	-0.00675	0.0292	-0.00634	-0.00730
	(0.0198)	(0.0575)	(0.0197)	(0.0190)
FDI	-0.0663	-0.332**	-0.0491	-0.0659
	(0.0476)	(0.139)	(0.0451)	(0.0626)
Constant	31.28	41.47	29.40	53.66*
	(23.98)	(44.10)	(23.85)	(27.50)

Observations	487	59	487	487
R-squared	0.232	0.543	0.244	0.290
Number of CountryCode_num	92	12	92	92
Year FE	YES	YES	YES	YES
Robust std. errors	YES	YES	YES	YES
Adjusted R2	0.184	0.145	0.194	0.243
RMSE	1.983	1.420	1.970	2.153

Can anyone help me out? What is the difference between 3 and 4 and how would I interpret the interaction terms since the results are different?

Thanks in advance!

Best
Clara

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17707

16 Apr 2024, 10:25

Clara:
what does the literature in your research field tell you about the data generating process you're dealing with? This should be the first step to take.
I cannot say what happens with your codes 3) and 4), as the way you reported them makes them difficult to read (please use CODE delimiters whenever possible. Thanks).
The way you wrote the interaction should produce the same results, regardless of whether you use ## or # plus the terms included in the interaction:

Code:

. use "https://www.stata-press.com/data/r18/nlswork.dta"
(National Longitudinal Survey of Young Women, 14-24 years old in 1968)

. xtreg ln_wage age  c.age#c.age i.msp , fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =     28,494
Group variable: idcode                          Number of groups  =      4,710

R-squared:                                      Obs per group:
     Within  = 0.1090                                         min =          1
     Between = 0.1012                                         avg =        6.0
     Overall = 0.0870                                         max =         15

                                                F(3, 4709)        =     338.90
corr(u_i, Xb) = 0.0448                          Prob > F          =     0.0000

                             (Std. err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         age |   .0551079   .0044026    12.52   0.000     .0464768    .0637391
             |
 c.age#c.age |   -.000616   .0000733    -8.40   0.000    -.0007598   -.0004722
             |
       1.msp |   -.009911   .0073569    -1.35   0.178    -.0243339    .0045119
       _cons |   .6277664   .0631405     9.94   0.000     .5039815    .7515513
-------------+----------------------------------------------------------------
     sigma_u |  .40390097
     sigma_e |  .30235497
         rho |  .64086857   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. xtreg ln_wage c.age##c.age i.msp , fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =     28,494
Group variable: idcode                          Number of groups  =      4,710

R-squared:                                      Obs per group:
     Within  = 0.1090                                         min =          1
     Between = 0.1012                                         avg =        6.0
     Overall = 0.0870                                         max =         15

                                                F(3, 4709)        =     338.90
corr(u_i, Xb) = 0.0448                          Prob > F          =     0.0000

                             (Std. err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         age |   .0551079   .0044026    12.52   0.000     .0464768    .0637391
             |
 c.age#c.age |   -.000616   .0000733    -8.40   0.000    -.0007598   -.0004722
             |
       1.msp |   -.009911   .0073569    -1.35   0.178    -.0243339    .0045119
       _cons |   .6277664   .0631405     9.94   0.000     .5039815    .7515513
-------------+----------------------------------------------------------------
     sigma_u |  .40390097
     sigma_e |  .30235497
         rho |  .64086857   (fraction of variance due to u_i)
------------------------------------------------------------------------------

.
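The example above interacts a continuous variable with itself, whereas the question involves a continuous variable interacted with a 0/1 dummy. A minimal sketch of that case follows, again on the nlswork example data rather than Clara's dataset, with msp standing in for the region dummy (an assumption made purely for illustration). If the equivalence holds as expected, the two stored estimates should match column by column. Note also that, as posted, command 3) has FLFP as the dependent variable while command 4) has FLFPratio; if these are two different variables, that alone could explain why the two columns differ.

```stata
* Sketch on the nlswork example data (not the dataset from the question).
* msp is only a stand-in for a 0/1 dummy such as EAP.
use "https://www.stata-press.com/data/r18/nlswork.dta", clear

* Long form: both main effects written out, plus the # interaction
xtreg ln_wage c.age i.msp c.age#i.msp, fe vce(cluster idcode)
estimates store long_form

* Short form: ## expands to both main effects plus the interaction
xtreg ln_wage c.age##i.msp, fe vce(cluster idcode)
estimates store short_form

* Coefficients and standard errors side by side
estimates table long_form short_form, b(%9.6f) se(%9.6f)

* Slope of age when msp == 1: main effect plus interaction coefficient
lincom age + 1.msp#c.age
```

Unlike msp, a region dummy that never changes within a country would have its main effect dropped by the fe estimator, leaving only the interaction term. The interaction coefficient is the difference in slopes between the two groups, so the slope for the group coded 1 is the sum that -lincom- reports. Applied to column (3) of the table in the question, the analogous sum would be 0.143 + (-0.848) = -0.705 for the EAP countries.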

I would also check the specification of the functional form of your regressand via the following chunks of code that basically mimic the -linktest- procedure (which, unfortunately, does not work after -xt- commands):

Code:

. xtreg ln_wage c.age##c.age i.msp , fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =     28,494
Group variable: idcode                          Number of groups  =      4,710

R-squared:                                      Obs per group:
     Within  = 0.1090                                         min =          1
     Between = 0.1012                                         avg =        6.0
     Overall = 0.0870                                         max =         15

                                                F(3, 4709)        =     338.90
corr(u_i, Xb) = 0.0448                          Prob > F          =     0.0000

                             (Std. err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         age |   .0551079   .0044026    12.52   0.000     .0464768    .0637391
             |
 c.age#c.age |   -.000616   .0000733    -8.40   0.000    -.0007598   -.0004722
             |
       1.msp |   -.009911   .0073569    -1.35   0.178    -.0243339    .0045119
       _cons |   .6277664   .0631405     9.94   0.000     .5039815    .7515513
-------------+----------------------------------------------------------------
     sigma_u |  .40390097
     sigma_e |  .30235497
         rho |  .64086857   (fraction of variance due to u_i)
------------------------------------------------------------------------------

. predict fitted, xb
(40 missing values generated)

. g sq_fitted=fitted^2
(40 missing values generated)

. xtreg ln_wage fitted sq_fitted , fe vce(cluster idcode)

Fixed-effects (within) regression               Number of obs     =     28,494
Group variable: idcode                          Number of groups  =      4,710

R-squared:                                      Obs per group:
     Within  = 0.1093                                         min =          1
     Between = 0.1035                                         avg =        6.0
     Overall = 0.0884                                         max =         15

                                                F(2, 4709)        =     520.74
corr(u_i, Xb) = 0.0472                          Prob > F          =     0.0000

                             (Std. err. adjusted for 4,710 clusters in idcode)
------------------------------------------------------------------------------
             |               Robust
     ln_wage | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      fitted |   2.312704   .7126323     3.25   0.001     .9156116    3.709797
   sq_fitted |  -.3967852   .2165126    -1.83   0.067    -.8212513    .0276809
       _cons |  -1.079376   .5840422    -1.85   0.065    -2.224372    .0656196
-------------+----------------------------------------------------------------
     sigma_u |  .40346976
     sigma_e |  .30230076
         rho |  .64045931   (fraction of variance due to u_i)
------------------------------------------------------------------------------

.

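As the coefficient on sq_fitted is not statistically significant at the 5% level (p = 0.067), the null is not rejected, and there is no evidence that the functional form is misspecified.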

Kind regards,
Carlo
(Stata 19.0)
