
  • Comparing coefficients across two models (same X data, but slightly different Ys)

    Hi Stata Forum,

    I am checking whether there is a simple way to compare two coefficients from -xtreg-. Take one model as Y1 = a1 + b1X1 + dummies + e1 and the other model as Y2 = a2 + b2X1 + dummies + e2, where X1 is exactly the same in both models, but Y1 changes slightly to become Y2. I'd like a statistical test of whether b1 = b2. This might involve saving the coefficients after -xtreg- and then comparing them with some sort of ttest-like command, or perhaps something more complex. Any information is appreciated.

    Thanks,
    Dan

  • #2
    The simplest way to do this is to not use -xtreg-, but rather to emulate it using -regress-, absorbing your panel variable by including indicator variables for it (-suest- does not work after -xtreg-). Then store the estimates, combine them with -suest-, and run a -test-. With no example data and real variable names I can't give you exact code, but here's the gist of it in pseudo-code:

    Code:
    regress Y1 X1 other_vars i.panel_var   // i.panel_var stands in for the fixed effects of -xtreg, fe-
    est store Y1
    regress Y2 X1 other_vars i.panel_var
    est store Y2
    suest Y1 Y2                            // combine the two models with cross-model covariances
    test [Y1_mean = Y2_mean]: X1           // Wald test of b1 = b2
    That said, you probably shouldn't do this at all. Unless Y1 and Y2 are measured in the same units and have the same distribution (not just the same mean and standard deviation), the results of a test of equal coefficients are uninterpretable: it is impossible to say what is attributable to a difference in the strength of association and what is due to scaling issues. So this is usually a fool's errand. And standardizing Y1 and Y2 does not fully solve the problem, unless it results in identical frequency distributions for the standardized variables.

    Assuming that you actually do have the same distributions for Y1 and Y2, bear in mind also that the American Statistical Association recommends no longer doing null hypothesis significance tests. See https://www.tandfonline.com/doi/full...5.2019.1583913. So better than the above would be to scrap the -test- command and instead use -lincom [Y1_mean]X1 - [Y2_mean]X1- to get an estimate of the coefficient difference along with its uncertainty (standard error, confidence interval).
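
    In the same pseudo-code terms, the -lincom- alternative would look like this (a minimal sketch, assuming the estimates were stored and combined exactly as above):

    Code:
    * after the same -suest Y1 Y2- run as above
    lincom [Y1_mean]X1 - [Y2_mean]X1   // estimated b1 - b2, with its standard error and 95% CI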

    • #3
      Thanks. I think this will work; I've been using fixed effects. That last bit, using -lincom-, sounds good. The purpose of my analysis is to test a methodology that is commonly used: I am seeing whether it is sensitive enough to detect actual changes in Y (which I have built in... thus both Ys are similar).

      • #4
        Originally posted by Clyde Schechter View Post
        Dear Clyde,

        I tried -suest- with clustered standard errors and the -lincom- command, as you suggested. But the standard errors from -regress- with clustered standard errors are different from the ones I get from -suest-. Is there a way around this problem?
        Alternatively, I also tried GMM, but given that I have over 1,000 panel IDs, the command returns the error "could not evaluate equation 1".

        Any suggestions or references in this regard would be very helpful.


        Regards,
        Nikita

        • #5
          I'm not sure what you did, and you don't show any specifics. The best I can tell you is that when you use -suest-, you should not apply clustered standard errors to the original -regress- commands. In fact, if you try to feed regressions with clustered standard errors into -suest- it will reject them. Run the regressions with "ordinary" standard errors, and then specify clustered standard errors in the -suest- command itself.
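
          In sketch form, reusing the placeholder names from #2 (an illustration rather than exact code):

          Code:
          regress Y1 X1 other_vars i.panel_var   // ordinary standard errors at this stage
          est store Y1
          regress Y2 X1 other_vars i.panel_var
          est store Y2
          suest Y1 Y2, vce(cluster panel_var)    // clustering is specified on -suest- itself
          lincom [Y1_mean]X1 - [Y2_mean]X1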

          If you did that, and are then comparing the results to separately estimating the two regressions with clustered standard errors, then, no, the results will not be the same, and you should not expect them to be. They are different things altogether. -suest- accounts for covariance across equations. By contrast, when you run each regression separately, it is, of course, impossible to account for cross-model covariance.

          I don't use GMM myself and know very little about it, so I can't advise you on that.

          • #6
            Thank you, Clyde!

            I understand that with a joint estimation, as the number of observations increases, the standard errors may differ.
            But shouldn't the standard errors be the same in the following two scenarios?

            Case (1):
            Code:
            reg y x, vce(cluster clust_var)
            Case (2):
            Code:
            reg y x
            estimates store eq_1
            suest eq_1, vce(cluster clust_var)
            Because even these are different.

            • #7
              Hmm, I'm surprised and I don't know how to explain it.

              • #8
                Originally posted by Nikita Sangwan View Post
                Surely the standard errors are not different! Perhaps you are being tripped up by the fact that -suest- outputs z-statistics while -regress- outputs t-statistics. In any case, present a data example illustrating your point.
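
                For example, here is a made-up illustration with an artificial cluster variable (not your data): the coefficients and standard errors from the two approaches should match, and only the reported test statistics and p-values differ.

                Code:
                sysuse auto, clear
                generate clust = mod(_n, 10)            // artificial cluster variable, for illustration only
                regress price mpg, vce(cluster clust)   // reports t-statistics
                regress price mpg
                estimates store eq_1
                suest eq_1, vce(cluster clust)          // same coefficients and SEs, but reports z-statistics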

                • #9
                  Ah yes, Andrew, thank you!

                  • #10
                    Hello, I have a similar question. I have two measures of Y and the same X, and I want to compare the coefficients across the two measures of Y to see whether the difference between them is significant, and also on which measure of Y X has the greater effect. So I am looking for a t-test or something like it. I do not know which command is more appropriate in my case, and I am also a bit confused about the -lincom- command. I would appreciate your help. Here is a sketch of the models I want to run:

                    Y1 = SIZE + CSR + LEV
                    Y2 = SIZE + CSR + LEV

                    Y1 and Y2 are two different measures of the same variable, so I want to see whether CSR has a more significant effect on Y1 or on Y2, and whether the difference between the two coefficients is significant.

                    • #11
                      Here is an example of how it is done, using auto.dta. The example itself is nonsensical, but it shows how you would use the -regress-, -estimates store-, -suest-, and -test- commands to accomplish it.

                      Code:
                      sysuse auto, clear
                      
                      regress price mpg headroom trunk
                      estimates store Y1
                      
                      regress displacement mpg headroom trunk
                      estimates store Y2
                      
                      suest Y1 Y2
                      
                      test [Y1_mean = Y2_mean]:headroom
                      That said, be warned that the size of regression coefficients is not a measure of their importance or effectiveness in any real-world sense. Coefficients are strongly influenced by the overall scale of the variables involved, and even when scaled the same, can be influenced by distributions. So be very cautious in interpreting any comparison of regression coefficients. Make sure you first check that the variables in the two regressions are on the same scale and that their distributions are strongly similar.
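
                      For instance, continuing the auto.dta example, here is a quick (purely illustrative) first check of scale and distributional similarity:

                      Code:
                      summarize price displacement   // the scales are wildly different here
                      qqplot price displacement      // quantile-quantile plot comparing the two distributions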

                      • #12
                        Hi Clyde,

                        Thanks for your reply. I ran the test and this is the error I get: "Y1 was estimated with a nonstandard vce (robust)".

                        Also, I really want to know whether the difference between them is significant or not. Is there any way to do so?

                        • #13
                          You can't use a robust vce in the regressions before you submit them to -suest-. Re-run the regressions without that option, then run -suest Y1 Y2, vce(robust)- and you will have what you are looking for.
                          Also, I really want to know whether the difference between them is significant or not. Is there any way to do so?
                          The output of -lincom- includes a z-test and p-value, though I will spare you my long rant about why I think you should ignore them.
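
                          Putting it together, a sketch using the placeholder model from #10:

                          Code:
                          regress Y1 SIZE CSR LEV              // no vce(robust) at this stage
                          estimates store Y1
                          regress Y2 SIZE CSR LEV
                          estimates store Y2
                          suest Y1 Y2, vce(robust)             // the robust VCE is specified here instead
                          lincom [Y1_mean]CSR - [Y2_mean]CSR   // difference, SE, CI, and the z-test and p-value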

                          • #14
                            I have read a lot of Q&A about comparing coefficients across groups using -suest-. I followed a similar approach to show that, within a constant reserveprice, the estimated coefficients (which show the marginal effect of privatevalue on bid) for the different groups of measrisk (a categorical variable taking the values 1, 2, and 3) are not equal. However, what I would like to show is that the estimated coefficients tend to increase as the level of measrisk increases (similar to what the command -nptrend- does, but on estimated coefficients). Is there any way to do something like this? A sample of the data set is shown below. Any help would be really appreciated.

                            Code:
                            clear
                            input byte(reserveprice bid privatevalue) float measrisk
                            0 78 96 3
                            0 51 76 1
                            0 1 1 3
                            0 55 56 3
                            0 71 71 1
                            0 11 15 3
                            0 38 44 2
                            0 12 17 1
                            0 20 21 1
                            0 16 36 1
                            0 49 52 3
                            0 28 43 2
                            0 60 65 1
                            0 76 93 2
                            0 65 88 1
                            0 54 71 3
                            0 14 28 1
                            0 57 91 1
                            0 65 71 1
                            0 50 52 1
                            0 60 70 3
                            0 80 86 3
                            0 31 44 1
                            0 25 32 3
                            0 57 71 2
                            0 43 46 1
                            0 58 68 1
                            0 20 23 3
                            0 55 68 1
                            0 48 58 2
                            0 60 80 1
                            0 27 28 2
                            0 5 77 1
                            0 85 95 3
                            0 5 8 1
                            0 40 46 2
                            0 60 69 3
                            0 40 44 1
                            0 25 30 1
                            0 76 96 3
                            0 35 97 3
                            0 49 67 1
                            0 65 80 1
                            0 15 19 3
                            0 27 29 3
                            0 35 71 1
                            0 10 14 2
                            0 65 87 3
                            0 10 15 1
                            0 25 26 3
                            0 3 4 1
                            0 20 47 3
                            0 35 42 2
                            0 23 23 2
                            0 51 56 3
                            0 27 36 1
                            0 49 56 3
                            0 73 98 3
                            0 12 15 3
                            0 30 32 3
                            0 51 71 1
                            0 10 14 1
                            0 15 17 3
                            0 64 72 3
                            0 65 74 1
                            0 36 44 3
                            0 25 54 1
                            0 20 30 3
                            0 16 23 3
                            0 49 65 2
                            0 42 51 3
                            0 20 27 1
                            0 75 83 3
                            0 43 53 1
                            0 3 4 2
                            0 43 55 1
                            0 5 6 1
                            0 57 66 2
                            0 14 26 3
                            0 73 93 1
                            0 52 74 3
                            0 20 48 2
                            0 50 85 2
                            0 7 11 1
                            0 45 59 1
                            0 74 85 1
                            0 51 52 2
                            0 75 88 1
                            0 33 36 2
                            0 50 74 3
                            0 50 98 1
                            0 61 81 3
                            0 30 46 2
                            0 68 91 1
                            0 35 91 2
                            0 57 77 1
                            0 20 34 1
                            0 60 70 1
                            0 21 54 1
                            0 47 98 2
                            end
                            label values reserveprice RP
                            label def RP 0 "RP=0", modify
                            label values measrisk measrisk
                            label def measrisk 1 "LRA", modify
                            label def measrisk 2 "MRA", modify
                            label def measrisk 3 "HRA", modify
                            Code:
                            reg bid privatevalue if reserveprice==0 & measrisk==1, noconstant
                            estimates store one
                            reg bid privatevalue if reserveprice==0 & measrisk==2, noconstant
                            estimates store two
                            reg bid privatevalue if reserveprice==0 & measrisk==3, noconstant
                            estimates store three
                            suest one two three
                            test [one_mean]privatevalue=[two_mean]privatevalue
                            test [two_mean]privatevalue=[three_mean]privatevalue, accum
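
                            In case it helps respondents, here is the kind of comparison I have in mind, in the spirit of the -lincom- suggestions earlier in this thread (a sketch, not something I am sure is valid):

                            Code:
                            * after the -suest one two three- run above: estimate the ordered
                            * differences directly; increasing point estimates with confidence
                            * intervals excluding zero would be consistent with a monotone trend
                            lincom [two_mean]privatevalue - [one_mean]privatevalue
                            lincom [three_mean]privatevalue - [two_mean]privatevalue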

                            • #15
                              Mize, Doan & Long have an excellent paper entitled A General Framework for Comparing Predictions and Marginal Effects across Models that may be helpful:

                              https://journals.sagepub.com/doi/ful...81175019852763

                              In this article, the authors provide a general framework for comparing predictions and marginal effects across models using seemingly unrelated estimation to combine estimates from multiple models, which allows tests of the equality of predictions and effects across models. The authors illustrate their method to compare nested models, to compare effects on different dependent or independent variables, to compare results from different samples or groups within one sample, and to assess results from different types of models.
                              I've encouraged Mize to write a Stata routine to make these methods easier to apply. In the meantime, the code used in the paper is linked at https://www.trentonmize.com/research
                              -------------------------------------------
                              Richard Williams, Notre Dame Dept of Sociology
                              Stata Version: 17.0 MP (2 processor)

                              EMAIL: [email protected]
                              WWW: https://www3.nd.edu/~rwilliam
