Coefficients from -areg- and difference in predicted scores using -margins- do not match!

Min Oh

Join Date: Apr 2022

Posts: 8
#1

25 Aug 2023, 10:52

Hello,

I'm having trouble working out why a coefficient in my -areg- output does not match the difference between the predicted values in the -margins- output.

Specifically, I used -areg- to fit an interrupted time-series model to estimate the impact of an event on test performance by participants' ethnicity. post1 indicates the first year after the event, and it enters the model as "i.post1##ib3.ethnicity", with the third ethnicity group, Hispanic, as the reference group. For reference, time = 0 marks when the event of interest occurred, so post1 = (time == 1).

Code:

areg testscore time female i.post1##ib3.ethnicity i.post2##ib3.ethnicity disabled servicestatus i.post1##ib1.level i.post2##ib1.level p_a p_b p_h p_w p_ai p_nh p_esl if include_fulldata==1 [pweight=ipsw*sweight], vce(cluster participantid) absorb(schoolid)

The results show that the coefficient for "1.post1" is -12.83, which I read as the event producing an average drop of 12.83 points for Hispanic participants (i.e., the reference group).

When I run -margins- to get the exact predicted test scores for Hispanic participants:

Code:

margins post1, over(ethnicity) at((means) time==1 post2=0)

In the -margins- output (format: ethnicity#post1), Stata returns 331.81 points at post1 assuming the event of interest did not happen (i.e., hispanic#0) and 322.64 points at post1 given that the event did happen (i.e., hispanic#1). Based on the -areg- results, I would expect the difference between the two predicted scores to be -12.83; however, it is -9.17.

I cannot figure out why the two values do not match, and I'd really appreciate any insight. Unfortunately, I can't share the data, or any part of it, because of confidentiality, but I am ready to run diagnostics or different versions of the -margins- command as you recommend.

Last edited by Min Oh; 25 Aug 2023, 11:02.

Clyde Schechter

Join Date: Apr 2014
Posts: 30164

25 Aug 2023, 11:57

The discrepancy arises because post1 is interacted not only with ethnicity but also with level. When the same variable appears in two interactions, its coefficients take on different meanings than when there is only one. In particular, the "main effect" of post1 now represents the effect of post1 conditional on both ethnicity and level being at their base values (Hispanic and level 1 in your model). Your -margins- command fixes ethnicity through over() but does not hold level at its base value, so the difference between the two predicted scores also picks up the post1#level interaction terms. You can see this directly with this example:

Code:

sysuse auto, clear

keep if rep78 >= 3 // rep78 == 3 BECOMES THE BASE LEVEL

summ mpg headroom trunk

// MODEL 1: foreign APPEARS IN ONLY ONE INTERACTION
regress price i.foreign##i.rep78 mpg headroom trunk

margins foreign, over(rep78) at((means) mpg = 15 headroom = 3 trunk = 10) post
lincom _b[3bn.rep78#1.foreign] - _b[3bn.rep78#0bn.foreign] // MATCHES COEFFICIENT OF foreign IN REGRESSION

// MODEL 2: foreign IS ALSO INTERACTED WITH trunk
regress price i.foreign##i.rep78 mpg headroom i.foreign##c.trunk
estimates store regression
margins foreign, over(rep78) at((means) mpg = 15 headroom = 3 trunk = 10) post
lincom _b[3bn.rep78#1.foreign] - _b[3bn.rep78#0bn.foreign] // DOES NOT MATCH COEFFICIENT OF foreign IN REGRESSION

// SAME MODEL, BUT WITH trunk SET TO 0
estimates restore regression
margins foreign, over(rep78) at((means) mpg = 15 headroom = 3 trunk = 0) post
lincom _b[3bn.rep78#1.foreign] - _b[3bn.rep78#0bn.foreign] // DOES MATCH
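
As a complement to the example above, here is a minimal sketch (not part of the original reply) of the same point read straight from the coefficients, using the same auto-data model. It assumes that rep78 = 3 is the base level after the -keep- and that trunk = 0 is the reference point for the continuous interaction. The first command asks -margins- for the foreign contrast with both interacting variables at those values, and the second rebuilds the foreign-vs-domestic gap at trunk = 10 by hand.

```stata
sysuse auto, clear
keep if rep78 >= 3

regress price i.foreign##i.rep78 mpg headroom i.foreign##c.trunk

* Contrast of foreign vs. domestic with rep78 at its base level (3) and trunk at 0:
* this should reproduce the coefficient on 1.foreign.
margins r.foreign, at(rep78 = 3 trunk = 0)

* Gap between foreign and domestic cars at rep78 = 3 and trunk = 10:
* the "main effect" plus ten times the foreign#trunk interaction coefficient.
lincom _b[1.foreign] + 10*_b[1.foreign#c.trunk]
```

By analogy, and only as an untested suggestion for the original model, a command such as "margins r.post1, at(ethnicity = 3 level = 1 post2 = 0)" would be expected to return the 1.post1 coefficient, because it holds both ethnicity and level at their base values.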

Min Oh

Join Date: Apr 2022

Posts: 8
#3

25 Aug 2023, 12:18

Clyde, that was it! Thank you very much for your detailed explanation.