Dear Statalist,
I am interested in the interpretation of the interaction term of two dummy/indicator variables.
Please find an illustrative example below:
I download a wage dataset from "campus.lakeforest.edu/lemke/econ330/stata/lab5/wages.dta"
I generate ln(wage) as the dependent variable, create an indicator for being black vs. non-black from the categorical race variable, and recode the string variable for sex into a female indicator variable.
Code:
gen lnwage = ln(wage)
gen byte black = 0
replace black = 1 if (race == 1)
gen byte female = 0
replace female = 1 if (sex == "F")
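To confirm that the recodes behave as intended, a quick cross-tabulation can help. This is only a sketch: it assumes, as in the code above, that race is numeric with 1 coding black and that sex is a string equal to "F" for women. The missing option is there so that observations with missing race or sex, which the recode above puts in the zero group, show up explicitly.

```stata
* Compare each new indicator with the variable it was built from
tabulate race black, missing
tabulate sex female, missing

* ln() returns missing for wage <= 0, so check how many observations drop out
count if missing(lnwage)
```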
Then I run the following regression model:
Code:
reg lnwage i.female##i.black, r
. reg lnwage i.female##i.black, r
Linear regression                                      Number of obs =     704
                                                       F(3, 700)     =    3.26
                                                       Prob > F      =  0.0210
                                                       R-squared     =  0.0130
                                                       Root MSE      =  1.0749

------------------------------------------------------------------------------
             |               Robust
      lnwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    1.female |  -.2639805   .0883198    -2.99   0.003     -.437384   -.0905769
     1.black |   -.204688   .1329487    -1.54   0.124     -.465714     .056338
             |
female#black |
        1 1  |   .4608161   .2452047     1.88   0.061    -.0206087    .9422409
             |
       _cons |   9.880011   .0578324   170.84   0.000     9.766465    9.993557
------------------------------------------------------------------------------
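For reference, and assuming I read the factor-variable notation correctly (female = 0 and black = 0 are the omitted base levels), the model behind this output and the four group means of ln(wage) implied by the estimates can be written out as follows. The symbols \beta_0 to \beta_3 are my own labels for _cons, 1.female, 1.black and the female#black term.

```latex
\begin{align*}
\ln(\text{wage}_i) &= \beta_0 + \beta_1\,\text{female}_i + \beta_2\,\text{black}_i
                      + \beta_3\,(\text{female}_i \times \text{black}_i) + \varepsilon_i \\[6pt]
\text{non-black male:}\quad   & \hat\beta_0 = 9.880011 \\
\text{non-black female:}\quad & \hat\beta_0 + \hat\beta_1 = 9.880011 - 0.2639805 = 9.6160305 \\
\text{black male:}\quad       & \hat\beta_0 + \hat\beta_2 = 9.880011 - 0.204688 = 9.675323 \\
\text{black female:}\quad     & \hat\beta_0 + \hat\beta_1 + \hat\beta_2 + \hat\beta_3
                                = 9.880011 - 0.2639805 - 0.204688 + 0.4608161 = 9.8721586 \\[6pt]
\hat\beta_3 &= (9.8721586 - 9.675323) - (9.6160305 - 9.880011)
             = 0.1968356 - (-0.2639805) = 0.4608161
\end{align*}
```

Under this reading, the interaction coefficient is the female-male gap among black workers minus the female-male gap among non-black workers.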
Question 1: What is the respective control group for a black female?
Question 2: How do I interpret the interaction term (black female) correctly?
Question 3: How does this interpretation differ from a classic difference-in-differences interaction term?
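One way to look at these questions in Stata itself is to ask for the predicted cell means and for the combined coefficients directly. The following is a minimal sketch, assuming the variables generated above. It uses only standard postestimation commands (margins and lincom), and vce(robust) is the spelled-out form of the r option.

```stata
* Re-fit the model so that margins and lincom use these estimates
regress lnwage i.female##i.black, vce(robust)

* Predicted mean of lnwage in each of the four female-by-black cells
margins female#black

* Female-male gap among non-black workers (the 1.female coefficient)
lincom 1.female

* Female-male gap among black workers (main effect plus interaction)
lincom 1.female + 1.female#1.black
```

The difference between the two lincom results should equal the coefficient on the female#black term.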
Now I change the regression by keeping the race variable with its three levels (black, Hispanic, white):
Code:
reg lnwage i.female##i.race, r
. reg lnwage i.female##i.race, r
Linear regression                                      Number of obs =     704
                                                       F(5, 698)     =    2.69
                                                       Prob > F      =  0.0204
                                                       R-squared     =  0.0182
                                                       Root MSE      =  1.0736

------------------------------------------------------------------------------
             |               Robust
      lnwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    1.female |   .1968356    .229074     0.86   0.390    -.2529209    .6465922
             |
        race |
          2  |  -.0781757   .1776793    -0.44   0.660    -.4270256    .2706742
          3  |    .245223   .1355059     1.81   0.071    -.0208249     .511271
             |
 female#race |
        1 2  |  -.1330989   .2964309    -0.45   0.654     -.715102    .4489042
        1 3  |  -.5099186    .249273    -2.05   0.041    -.9993334   -.0205038
             |
       _cons |   9.675323   .1198826    80.71   0.000     9.439949    9.910697
------------------------------------------------------------------------------
Question 4: How do I interpret the two interaction effects?
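For the three-level model, the same postestimation commands can be used to see what each interaction term adds. This is a sketch under the assumption that race == 1 (black, as in my recode above) is the base level Stata picked. The constant of 9.675323 is consistent with that, since it equals _cons + 1.black from the first regression (9.880011 - .204688). Which groups levels 2 and 3 stand for should be checked against the value labels in the data.

```stata
* Re-fit the three-level model so the commands below refer to it
regress lnwage i.female##i.race, vce(robust)

* Predicted mean of lnwage for every female-by-race cell
margins female#race

* Female-male gap within each race group
margins race, dydx(female)

* The same gap for race == 3, built by hand from the coefficients
lincom 1.female + 1.female#3.race

* Joint test that the female-male gap is the same in all three race groups
testparm i.female#i.race
```

Each female#race coefficient is then the difference between the female-male gap in that race group and the gap in the base group.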
Any help or reference is highly appreciated.