Dear Statalist,
I have a question about interactions. Suppose I want to compare coefficients across subgroups (here, by gender), and I want to do this with an exposure-by-gender interaction model, testing the significance of the interaction coefficients.
But if I have covariates in the model, do I need to interact my effect modifier (gender) with ALL possible variables in the model? I understand that this "fully-interacted" model will generate the same coefficients as in the stratified model.
If, instead, I do not interact the covariates with the effect modifier, then the gender-specific exposure estimates will not match the exposure estimates from the gender-stratified models.
My research question of interest is focused on the exposure-outcome relationship and how it varies by gender, with the covariates serving only as control variables. So would it be better to run only the exposure-by-gender interaction and leave the covariates to the "pooled" sample (without covariate-by-gender interaction)?
Code:
y = exposure + gender + exposure*gender + covariate + covariate*gender
with the gender-specific exposure coefficients equal to the stratified model:
y = exposure + covariate if gender=0
y = exposure + covariate if gender=1
Code:
y = exposure + gender + exposure*gender + covariate
with the gender-specific exposure coefficients NOT equal to the stratified model:
y = exposure + covariate if gender=0
y = exposure + covariate if gender=1
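For what it is worth, the stratified slopes can also be compared directly rather than through a pooled interaction model. The following is only a rough sketch, assuming the auto dataset as a stand-in (with foreign playing the role of gender and length the exposure) and only one covariate to keep it short. The stored-estimate names est_dom and est_for are made up for illustration.

```stata
sysuse auto, clear

* stratified model within foreign==0 (names est_dom / est_for are illustrative)
regress price length weight if foreign==0
estimates store est_dom

* stratified model within foreign==1
regress price length weight if foreign==1
estimates store est_for

* combine the two sets of estimates and test equality of the length slopes
suest est_dom est_for
test [est_dom_mean]length = [est_for_mean]length
```

Because suest uses a robust variance estimator, this test need not agree exactly with the interaction test from a pooled regress, which assumes a common residual variance.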
A related question concerns difference-in-differences (DiD) methods, which are basically a treatment-by-time interaction. If we have covariates in a DiD model, do we interact time with all of the covariates, or only with the treatment variable?
Code:
y = treatment + post + treatment*post + covariate + covariate*post
versus
y = treatment + post + treatment*post + covariate
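To make the two DiD specifications concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the variable names, the sample size, and the data-generating values are invented, and the data are a simple cross-section rather than a real panel.

```stata
clear
set seed 12345
set obs 400

* hypothetical data: treatment indicator, post-period indicator, one covariate
generate byte treatment = mod(_n, 2)
generate byte post = _n > 200
generate covariate = rnormal()
generate y = 1 + 0.5*treatment + 0.3*post + 2*treatment*post + 0.8*covariate + rnormal()

* specification 1: covariate effect held constant across periods
regress y i.treatment##i.post c.covariate
lincom 1.treatment#1.post

* specification 2: covariate effect allowed to differ between periods
regress y i.treatment##i.post c.covariate##i.post
lincom 1.treatment#1.post

* does the covariate-by-post interaction add anything?
testparm i.post#c.covariate
```

In both models the DiD estimate is the coefficient on 1.treatment#1.post. The second model only changes how the covariate is adjusted for.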
Sample code in Stata:
Code:
sysuse auto, clear

*stratified model 1 focusing on length -> price relationship within foreign==0
regress price length weight i.rep78 if foreign==0
lincom _b[length]

*stratified model 2 focusing on length -> price relationship within foreign==1
regress price length weight i.rep78 if foreign==1
lincom _b[length]

*fully-interacted model, length -> price estimates (within levels of foreign) equivalent to stratified models above
quietly regress price c.length##i.foreign c.weight##i.foreign i.rep78##i.foreign
lincom _b[length]
lincom _b[length]+_b[1.foreign#c.length]

*but if only interacting the exposure variable of interest, and not covariates, then length -> price estimates (within levels of foreign) will not be equal to stratified models above
quietly regress price c.length##i.foreign c.weight i.rep78
lincom _b[length]
lincom _b[length]+_b[1.foreign#c.length]
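One possible way to inform the choice between the two specifications is to test, within the fully-interacted model, whether the covariate-by-group interactions are jointly distinguishable from zero. This is a sketch under the same auto-data setup as above, not a recommendation of a particular decision rule.

```stata
sysuse auto, clear

* fully-interacted model, same specification as in the sample code
regress price c.length##i.foreign c.weight##i.foreign i.rep78##i.foreign

* test of the exposure-by-group interaction (does the length slope differ by foreign?)
test 1.foreign#c.length

* joint test of the covariate-by-group interactions
testparm i.foreign#c.weight i.rep78#i.foreign
```

If the joint test gives little evidence for the covariate interactions, the simpler exposure-only interaction model may be defensible, although a non-significant test does not by itself show that the pooled covariate effects are correct.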