  • Test for the difference between two point predictions obtained from separate regressions

    Dear colleagues,

    I would like to perform a statistical test of the difference between two point predictions obtained from separate regressions, but I don't know how to go about it. This would seem to be a pretty basic exercise, but I also don't know the statistics behind it, so I would very much appreciate someone's help.

    To illustrate, consider this dataset:

    Code:
    clear
    set seed 1234
    set obs 1000
    drawnorm x e
    gen y1 = 2*x + 3 + e
    gen y2 = 3*x + 4 + e

    I run these two regressions, and I obtain the fitted values (and, for good measure, the standard error of the fitted values). To illustrate my question, let me also plot the regression lines.

    Code:
    reg y1 x
    predict y1h, xb
    predict se1, stdp
    *
    reg y2 x
    predict y2h, xb
    predict se2, stdp
    *
    twoway (line y1h x, sort) (line y2h x, sort)
    I would like to test the null hypothesis that the difference between the conditional mean of y1 and y2, given (say) x = 2, is equal to zero. So, formally, H0: "E[y1|x=2] - E[y2|x=2] = 0".

    How should I proceed?


  • #2
    Here's an illustration of the approach using the built-in auto data:

    Code:
    sysuse auto, clear
    
    regress headroom price
    estimates store headroom
    
    regress trunk price
    estimates store trunk
    
    suest trunk headroom
    
    test [trunk_mean = headroom_mean]:price
    Note: -suest- works with most estimation commands, but there are some that it does not support. Also, there are issues about how to use it with robust variance estimation or survey data. So I urge you to read the section on -suest- in the PDF manuals that came installed with your Stata before you use it with anything more complicated than a "vanilla" -regress- command.

    All of that said, wouldn't it be more useful to get an estimate of the difference between the two coefficients, with a confidence interval, instead of a p-value for a test of a straw-man null hypothesis? Presumably you care about how different the coefficients are, not the probability that you would see results like the ones you got if they were the same, right?
    Code:
    lincom _b[trunk_mean:price] - _b[headroom_mean:price]  // AFTER -suest-
    Last edited by Clyde Schechter; 24 Oct 2020, 14:47.



    • #3
      Thank you for your quick response. -suest- can be used to test cross-model hypotheses concerning the estimated regression coefficients. For instance, to go back to my vanilla example, we could test, and obviously reject, the null that my two regression lines have equal slopes:

      Code:
      qui reg y1 x
      est store one
      qui reg y2 x
      est store two
      suest one two
      test [one_mean]x - [two_mean]x = 0
      My question, however, was about testing the equality of the fitted values. So, when x = -1 (in my example), the difference between the fitted values is (y2 - y1) = 0. But when x = 2, the difference is (y2 - y1) = 3. Is this difference statistically distinguishable from 0? Ostensibly yes, of course, but I would like to perform a formal test (which I don't think we can accomplish with -suest-).


      I am not sure whether this clarifies what I am after.

      (I don't think this is a straw-man null. For instance, we might want to know whether, based on the sample evidence, we can conclude that, given x = 0.9, the difference between the predicted values, (y2 - y1) = 0.1, is actually distinguishable from 0.)
      Last edited by Luca Uberti; 24 Oct 2020, 16:00.



      • #4
        Small correction (to make my example above more realistic, I should have generated y1 and y2 so that they incorporate two different idiosyncratic components)

        Code:
        clear
        set seed 1234
        set obs 1000
        drawnorm x e u
        gen y1 = 2*x + 3 + e
        gen y2 = 3*x + 4 + u
        
        reg y1 x
        predict y1h, xb
        
        reg y2 x
        predict y2h, xb
        
        twoway (line y1h x, sort) (line y2h x, sort), xline(2)
        Based on the estimates, E(y2 - y1 | x = 2) = (3.052173 - 1.965255)*2 + (4.023886 - 2.954644) = 3.243078

        I would like to perform a statistical test to answer this question:

        Based on the sample evidence, is the estimated difference between the conditional means given x = 2: a. statistically distinguishable from 0 (meaning y1 and y2 are different); b. statistically distinguishable from 3 (the true value of the difference)?
        Last edited by Luca Uberti; 24 Oct 2020, 16:48.



        • #5
          I'm sorry, but I don't understand what you want to do here. What you say in #4 seems clear enough. But in #3 you talk about comparisons of fitted values--which has nothing at all to do with what appears in #4. Nor do I see any role at all for -predict- (nor for the variables y1h and y2h) in resolving the question in #4. So I don't know what you are getting at.

          I'm just going to ignore #3 and the use of -predict- and solve the question posed in #4. If that's not what you want, do post back with clarifications.

          Code:
          clear
          set seed 1234
          set obs 1000
          drawnorm x e u
          gen y1 = 2*x + 3 + e
          gen y2 = 3*x + 4 + u
          
          reg y1 x
          estimates store one
          reg y2 x
          estimates store two
          
          suest one two
          
          //  WHAT IS E(y2-y1 | X = 2)?
          lincom 2*(_b[two_mean:x]-_b[one_mean:x]) + _b[two_mean:_cons] - _b[one_mean:_cons]
          The end result of this is:
          Code:
          . lincom 2*(_b[two_mean:x]-_b[one_mean:x]) + _b[two_mean:_cons] - _b[one_mean:_cons]
          
           ( 1)  - 2*[one_mean]x - [one_mean]_cons + 2*[two_mean]x + [two_mean]_cons = 0
          
          ------------------------------------------------------------------------------
                       |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   (1) |   3.160683    .098308    32.15   0.000     2.968003    3.353364
          ------------------------------------------------------------------------------
          So this is your estimate of E(y2-y1 | x = 2). You can clearly see that the 95% confidence interval includes 3, and in fact is pretty tight around 3. You can clearly reject the hypothesis that this difference is zero. And you clearly cannot reject the hypothesis that this is 3, at least not at the .05 significance level.

          If you specifically want a test of the hypothesis that this difference equals 3 you can get that with:
          Code:
          . lincom 2*(_b[two_mean:x]-_b[one_mean:x]) + _b[two_mean:_cons] - _b[one_mean:_cons] - 3
          
           ( 1)  - 2*[one_mean]x - [one_mean]_cons + 2*[two_mean]x + [two_mean]_cons = 3
          
          ------------------------------------------------------------------------------
                       |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   (1) |   .1606834    .098308     1.63   0.102    -.0319967    .3533636
          ------------------------------------------------------------------------------
          But be super-cautious in interpreting this one. I don't know the context in which you plan to apply this. In the example we have been discussing in this thread, the data are literally sampled from a population in which E(y2-y1 | x = 2) = 3, exactly. That is, the null hypothesis H0: E(y2-y1 | x = 2) = 3 is true by construction. Therefore, if you had happened to get a sample in which the result gave you p < 0.05, it would be absolutely wrong to reject the null hypothesis. That null hypothesis is true, and no random sample changes that fact. The only legitimate conclusion would be that the sample was a fluke--by definition, it is a Type I error. (If you are planning to apply this code to data whose true data generating process is not known, then you don't have to worry about this.)



          • #6
            Thank you for your response and illustrations. That's indeed what I wanted! Pretty simple, but it just did not occur to me. (Also, yes, in my application the true DGP is not known; the example was just intended to simplify things.)

            Can I ask you an additional question - about your remark that comparisons of fitted values have nothing at all to do with my question in #4? Conceptually, I understand that. In practice, however, I may be interested in obtaining, and testing, the conditional mean difference given a value of x that was actually observed in the sample - for instance, observation no. 323:

            Code:
            clear
            set seed 1234
            set obs 1000
            drawnorm x e u
            gen y1 = 2*x + 3 + e
            gen y2 = 3*x + 4 + u
            
            gen id = _n
            
            reg y1 x
            estimates store one
            
            reg y2 x
            estimates store two
            
            list id x if id==323
            I could test that the conditional mean difference E(y2-y1|x=2.004878) is equal to zero by calling a cross-model linear combination of parameters, as you suggest:

            Code:
            suest one two
            lincom 2.004878*(_b[two_mean:x]-_b[one_mean:x]) + _b[two_mean:_cons] - _b[one_mean:_cons]
            Now my question - could I also perform a simple t-test of the difference between the two predicted conditional means, as follows?

            Code:
            qui reg y1 x
            predict y1h, xb
            predict se1, stdp
            gen sd1 = se1*sqrt(1000)
            
            qui reg y2 x
            predict y2h, xb
            predict se2, stdp
            gen sd2 = se2*sqrt(1000)
            
            list id x y2h sd2 y1h sd1 if id==323
            ttesti 1000 10.14312 2.194182 1000 6.89474  2.258151, unequal
            The estimated t-statistics (32.67 and 32.62, respectively) are very close but not identical. Is this because -lincom- uses a slightly different method to compute the standard errors than -predict, stdp-? Or is there a more important difference, statistically speaking, between these two approaches?

            (I am asking this question because, in my actual application, I don't use -reg- but -xtabond2-, which does not seem to support -suest-, so I am trying to find a workaround.)
            Last edited by Luca Uberti; 25 Oct 2020, 13:44.



            • #7
              No, the method you show using -predict- is something different. It is almost coincidental that they came out close. The standard errors being calculated by -lincom- are the correct standard errors for your stated question. The standard errors, -stdp-, generated by -predict- are the sampling standard deviations for predicting the outcome of a single observation with that value of x. It's a very different beast.

              Also, apart from the standard errors, the predicted value itself in a single observation will be, in general, affected by the values of the other variables in your model (in a real life situation where it isn't just a regression of y against a single x) and thus contain idiosyncratic contributions from these other variables.



              • #8
                Code:
                clear
                set seed 1234
                set obs 1000
                drawnorm x e u
                gen y1 = 2*x + 3 + e
                gen y2 = 3*x + 4 + u
                
                regress y1 x
                regress y2 x
                
                gen long obs_no = _n
                reshape long y, i(obs_no) j(which_y)
                
                xtset obs_no which_y
                xtreg y i.which_y##c.x, fe
                
                //  CALCULATE EXPECTED VALUE OF Y2-Y1 | X = 2
                lincom 2*_b[2.which_y#c.x] + _b[2.which_y]
                You are right that -suest- does not support -xtabond2-. The above is a generic approach that I think will work with -xtabond2-. I do not myself use -xtabond2- and I don't really know much about Arellano-Bond estimation, so there may be reasons why this calculation would not be valid for it. But offhand, this really should work with pretty much any regression model.

                Added: In applying this to a more complicated model, i.which_y would have to be interacted with all of the predictor variables.
                Last edited by Clyde Schechter; 25 Oct 2020, 14:16.



                • #9
                  Dear Clyde, thank you for your comments and suggestions. The approach you suggest in #8 is clever (we could use it in our application, but a lot of the code would have to change, for reasons that are too long to explain here).

                  As for #7, I thought that the sampling standard deviations for predicting the outcome of a single sample observation for a given value of x were given by -predict, stdf-. The Stata documentation for -predict- indicates that "the statistic produced by stdp can be thought of as the standard error of the predicted expected value", while that produced by -stdf- as "the standard error of the point prediction", that is, the statistic for the prediction of a single outcome value. Shouldn't the statistics produced by -stdp- coincide with those produced by -lincom-?


                  Yes, in real life situations, the predictions for a single observation will contain the contributions from the other variables, but these contributions may be purged by using the user-written command -partpred- (for partial predictions), instead of using -predict-.
                  Last edited by Luca Uberti; 25 Oct 2020, 15:28.



                  • #10
                    Actually, it does look like the standard errors calculated by -predict- are the same as those calculated by -lincom- and, indeed, by -margins-:

                    Code:
                    clear
                    set seed 1234
                    set obs 1000
                    drawnorm x e
                    gen y = 2*x + 3 + e
                    gen id = _n
                    
                    reg y x
                    predict yhat, xb
                    predict se, stdp
                    
                    list id x y yhat se if id==323
                    
                    *lincom
                    qui reg y x
                    lincom _b[x]*2.004878 + _b[_cons]
                    
                    *margins
                    margins, at(x=2.004878)
                    Is the discrepancy in my previous code due to the fact that I have to compute the t-test 'manually' (with -ttesti-) and hence lose a few decimals?

                    (I am using Stata 13)



                    • #11
                      Probably so.

                      I'm not surprised that the standard errors from -margins- and -predict- are the same: they're supposed to be. And yes, -predict, stdp- does use the same standard errors as -lincom-. For some reason I thought your code used -stdf- instead. I could have sworn I saw that, but I don't see it anywhere now. My mistake.

                      But don't rejoice too quickly about the agreement between -margins- and -lincom- here. That works only because you have only one predictor variable, x. If there are several predictors (including, I gather, lags of x), then these will no longer agree:

                      Code:
                      . sysuse auto, clear
                      (1978 Automobile Data)
                      
                      . regress price mpg headroom
                      
                            Source |       SS           df       MS      Number of obs   =        74
                      -------------+----------------------------------   F(2, 71)        =     10.44
                             Model |   144280501         2  72140250.4   Prob > F        =    0.0001
                          Residual |   490784895        71  6912463.32   R-squared       =    0.2272
                      -------------+----------------------------------   Adj R-squared   =    0.2054
                             Total |   635065396        73  8699525.97   Root MSE        =    2629.2
                      
                      ------------------------------------------------------------------------------
                             price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                               mpg |  -259.1057   58.42485    -4.43   0.000    -375.6015   -142.6098
                          headroom |  -334.0215   399.5499    -0.84   0.406    -1130.701    462.6585
                             _cons |   12683.31   2074.497     6.11   0.000     8546.885    16819.74
                      ------------------------------------------------------------------------------
                      
                      . lincom _b[mpg]*2 + _cons
                      
                       ( 1)  2*mpg + _cons = 0
                      
                      ------------------------------------------------------------------------------
                             price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                               (1) |    12165.1   1977.561     6.15   0.000     8221.959    16108.25
                      ------------------------------------------------------------------------------
                      
                      . margins, at(mpg = 2)
                      
                      Predictive margins                              Number of obs     =         74
                      Model VCE    : OLS
                      
                      Expression   : Linear prediction, predict()
                      at           : mpg             =           2
                      
                      ------------------------------------------------------------------------------
                                   |            Delta-method
                                   |     Margin   Std. Err.      t    P>|t|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                             _cons |    11165.3   1168.134     9.56   0.000     8836.103    13494.49
                      ------------------------------------------------------------------------------
                      In any case I don't see how -margins- can help you with these cross-regression comparisons unless you have reduced the two regressions to one along the lines of #8.

                      I can't think of any other approach, I'm afraid. Perhaps somebody else who is following the thread can, I hope.



                      • #12
                        Dear Clyde, this conversation was most useful, and I will probably try something along the lines of #8, which looks like the most elegant option.

                        As for your example, the two commands don't agree simply because -lincom-, unlike -margins-, does not average over the remaining covariates, nor does it hold them constant at their means. The contribution of the other covariates has to be entered manually:

                        Code:
                        sysuse auto, clear
                        qui regress price mpg headroom
                        margins, at(mpg = 2) atmeans
                        lincom _b[mpg]*2 + _b[headroom]*2.993243 + _cons
                        Thanks again!



                        • #13
                          Yes, that's exactly right.
