Difference between interaction term or subgroup analysis when introducing control variables

Gustav Egede Hansen

Join Date: May 2021
Posts: 94


21 Dec 2022, 04:22

Hi everybody

I have a question about comparing public and outsourced employees with respect to job engagement. I usually test for a difference with an interaction term. However, a reviewer has asked me to include a subgroup analysis to test for differences in slopes. My intuition was that the difference in subgroup slopes would be identical to the interaction term. Without control variables, the point estimates and SEs are indeed essentially the same (apart from minor differences, which I assume arise because suest reports a z-test while the interaction analysis uses a t-test). However, once I introduce control variables, the two approaches differ in their SEs and, surprisingly, also in their point estimates. Can anybody explain this difference?

Without control variables:

Code:

* Interaction approach
reg engagement i.treatment_matched##c.workload [iw=cem_weights], vce(robust)

* Subgroup approach: fit each group separately, then combine with suest
foreach var1 of varlist engagement {
    foreach var2 of varlist workload {
        * vce(robust) is left off here; it is applied in suest below
        qui reg `var1' c.`var2' [iw=cem_weights] if treatment_matched == 0
        est store treatment_matched0
        qui reg `var1' c.`var2' [iw=cem_weights] if treatment_matched == 1
        est store treatment_matched1
        di "`var1'" " & " "`var2'"
        qui suest treatment_matched0 treatment_matched1, vce(robust) coefl
        * Difference in workload slopes between the two groups
        lincom (_b[treatment_matched1_mean:`var2'] - _b[treatment_matched0_mean:`var2'])
    }
}

With control variables:

Code:

* Interaction approach; the controls enter as main effects only
reg engagement i.treatment_matched##c.workload c.alder_alt ib(1).kon ib(2).uddannelse [iw=cem_weights], vce(robust)

* Subgroup approach: the control coefficients are free to differ by group
foreach var1 of varlist engagement {
    foreach var2 of varlist workload {
        * vce(robust) is left off here; it is applied in suest below
        qui reg `var1' c.`var2' c.alder_alt ib(1).kon ib(2).uddannelse [iw=cem_weights] if treatment_matched == 0
        est store treatment_matched0
        qui reg `var1' c.`var2' c.alder_alt ib(1).kon ib(2).uddannelse [iw=cem_weights] if treatment_matched == 1
        est store treatment_matched1
        di "`var1'" " & " "`var2'"
        qui suest treatment_matched0 treatment_matched1, vce(robust) coefl
        * Difference in workload slopes between the two groups
        lincom (_b[treatment_matched1_mean:`var2'] - _b[treatment_matched0_mean:`var2'])
    }
}

Thanks!

Best
Gustav


Gustav Egede Hansen

Join Date: May 2021

Posts: 94
#2

21 Dec 2022, 05:35

Hi again,

Following #2 in this thread (https://www.statalist.org/forums/for...oup-regression), I realized that I had to interact all the control variables with treatment_matched to arrive at equivalent point estimates for the interaction approach and the subgroup approach. However, the SE is slightly lower in the subgroup analysis (.1162794) than in the interaction analysis (.118738), which aligns with my reviewer's claim that the subgroup approach has more statistical power. I know that Clyde Schechter usually says "that the difference between a statistically significant finding and a non-statistically significant finding is, itself, not statistically significant", so is there any other argument for choosing one approach over the other?
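For anyone who wants to reproduce the equivalence, here is a minimal sketch. It does not use my data: it assumes Stata's shipped auto dataset, with foreign standing in for treatment_matched, weight for workload, and length for a single control. These stand-ins are only for illustration.

```stata
* Sketch only: the shipped auto data stand in for the thread's variables
* (foreign ~ treatment_matched, weight ~ workload, length ~ a control).
sysuse auto, clear

* Interaction approach with ALL regressors interacted with the group dummy
regress price i.foreign##(c.weight c.length), vce(robust)
scalar b_int = _b[1.foreign#c.weight]

* Subgroup approach, combined with suest as in #1
quietly regress price c.weight c.length if foreign == 0
estimates store g0
scalar b0 = _b[weight]
quietly regress price c.weight c.length if foreign == 1
estimates store g1
scalar b1 = _b[weight]
quietly suest g0 g1, vce(robust)
lincom [g1_mean]weight - [g0_mean]weight

* The two point estimates of the slope difference should agree
display "interaction: " b_int "   subgroups: " b1 - b0
```

For the model in #1, the analogous specification would presumably be i.treatment_matched##(c.workload c.alder_alt ib(1).kon ib(2).uddannelse) with the same weights and vce(robust); I have not rerun that exact line for this post.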

Best
Gustav
Jared Greathouse

Join Date: Sep 2021

Posts: 2170
#3

21 Dec 2022, 05:49

Concern yourself, always, with practicality. If you go up to a city manager and say "Hey, I know this cool intervention that can save you 0.00004%, on average, on... building roads or whatever city manager people do", see what their reaction will be after you tell them it is statistically significant. Either way, I haven't used CEM in a while, so I can't comment on the details, but subgroup analyses and interactions are two different things. If I'm estimating the impact of raising the minimum wage on employment, I can estimate a very general effect (all restaurants in the treated/control cities), or I can run the same analysis but restrict my sample to French establishments, or Chinese establishments, or establishments that make over a given amount of the median income.
ghoetker

Join Date: Mar 2014

Posts: 24
#4

21 Dec 2022, 14:56

I think you’d get some difference b/c you have one error term in the interaction model and two error terms in the sub-group model. Gujarati (1970), while…old…lays things out very clearly. Worth a quick read.

Gujarati, D. (1970). Use of dummy variables in testing for equality between sets of coefficients in linear regressions: a generalization. American Statistician, 24(5), 18–22.
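A minimal sketch of the error-term point, assuming Stata's shipped auto dataset as a stand-in (foreign for the group indicator, weight and length for the regressors; none of these come from the original analysis). The fully interacted model reports one pooled residual variance, while the subgroup regressions each report their own. The residual sums of squares add up, but with conventional (non-robust) standard errors the pooled model applies a single RMSE to both groups. With vce(robust), as used in #1, the standard errors no longer rely on a common residual variance, so this sketch illustrates the mechanism rather than the exact size of the gap reported in #2.

```stata
* Sketch only: auto data as a stand-in for the thread's variables
sysuse auto, clear

* One error term: fully interacted model, conventional standard errors
quietly regress price i.foreign##(c.weight c.length)
scalar rss_pooled = e(rss)
display "pooled RMSE:     " e(rmse)

* Two error terms: one regression per group
quietly regress price c.weight c.length if foreign == 0
scalar rss0 = e(rss)
display "RMSE, foreign=0: " e(rmse)
quietly regress price c.weight c.length if foreign == 1
scalar rss1 = e(rss)
display "RMSE, foreign=1: " e(rmse)

* The pooled residual sum of squares is the sum of the group-specific ones
display "RSS pooled: " rss_pooled "   RSS0 + RSS1: " rss0 + rss1
```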

_______________________________________

Glenn Hoetker
Professor in Business Strategy

Melbourne Business School, University of Melbourne
200 Leicester Street, Carlton, Victoria 3053, Australia

I acknowledge the Traditional Owners of the land on which I work, the Wurundjeri people of the Kulin Nations, and pay my respects to their Elders, past and present.
Gustav Egede Hansen

Join Date: May 2021

Posts: 94
#5

28 Dec 2022, 04:19

Thank you, Jared and Glenn. That makes great sense. And thanks for the literature.