Hi,
I was experimenting with interaction terms and found an unexpected result when using # and ## with two binary variables. To preface, I understand the difference between # and ## when combining a categorical/binary variable with a continuous one; it is only the case of two binary variables that puzzles me.
In the code below, I run two regressions, and the coefficient on the interaction term (the 1 1 row of high_mpg#domestic) differs between them. As far as I can tell, both regressions include high_mpg and domestic on their own, and the matching coefficients (-2053.396 and -4427.653) seem to support that. But the 1 1 coefficient is -4914.193 with # and 1566.856 with ##. Why does it differ?
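* Load the example data and build the binary indicators
sysuse auto, clear
tab foreign, g(foreign_)
* foreign_1 is the indicator for the first category of foreign (Domestic)
ren foreign_1 domestic
gen low_mpg = 0
replace low_mpg = 1 if mpg <= 20
gen high_mpg = 0
replace high_mpg = 1 if mpg > 20
* Interaction only (#)
reg price i.high_mpg#i.domestic gear_ratio
* Main effects plus interaction (##)
reg price i.high_mpg##i.domestic gear_ratio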
Results for single #
reg price high_mpg#domestic gear_ratio
-----------------------------------------------------------------------------------
            price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
------------------+----------------------------------------------------------------
high_mpg#domestic |
             0 1  |  -4427.653   1330.882    -3.33   0.001     -7082.69   -1772.617
             1 0  |  -2053.396   1393.149    -1.47   0.145    -4832.652    725.8594
             1 1  |  -4914.193   1322.372    -3.72   0.000    -7552.253   -2276.133
                  |
       gear_ratio |  -3862.311   1040.507    -3.71   0.000    -5938.065   -1786.557
            _cons |   21517.58   3500.157     6.15   0.000     14534.95     28500.2
-----------------------------------------------------------------------------------
Results for double ##
reg price high_mpg##domestic gear_ratio
-----------------------------------------------------------------------------------
            price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
------------------+----------------------------------------------------------------
       1.high_mpg |  -2053.396   1393.149    -1.47   0.145    -4832.652    725.8594
       1.domestic |  -4427.653   1330.882    -3.33   0.001     -7082.69   -1772.617
                  |
high_mpg#domestic |
             1 1  |   1566.856   1528.869     1.02   0.309    -1483.154    4616.866
                  |
       gear_ratio |  -3862.311   1040.507    -3.71   0.000    -5938.065   -1786.557
            _cons |   21517.58   3500.157     6.15   0.000     14534.95     28500.2
-----------------------------------------------------------------------------------
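One check that might help pin this down, offered as a sketch rather than an explanation: after the ## regression, lincom can add the two main-effect coefficients and the interaction coefficient. By hand, -2053.396 + (-4427.653) + 1566.856 = -4914.193, which is the 1 1 coefficient from the # regression. The gen byte lines below are my own shorthand for the indicators and are assumed to be equivalent to the construction in my code, since auto has no missing values in mpg or foreign.

```stata
* Rebuild the two indicators (shorthand, assumed equivalent to the original code)
sysuse auto, clear
gen byte domestic = foreign == 0
gen byte high_mpg = mpg > 20

* Fit the ## model, then sum both main effects and the interaction
quietly regress price i.high_mpg##i.domestic gear_ratio
lincom 1.high_mpg + 1.domestic + 1.high_mpg#1.domestic

* The same sum, using the coefficients reported in the output
display -2053.396 + (-4427.653) + 1566.856
```

If the lincom estimate matches the 1 1 row of the # output, the two regressions would seem to be the same fit reported against different reference points.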