  • Understanding Stata's post estimation test command

    I have a dataset with 25 observations. The variables are y, k and l.
    I run the following regression:
    regress y k l
    I want to test that the coefficients on k and l sum to 10, so I type:

    test k+l=10

     ( 1)  k + l = 10

            F(  1,    22) =  107.68
                 Prob > F =    0.0000
    My issue is that when I estimate the restricted regression and calculate the F-statistic using the usual formula \(F = \dfrac{(R^2_{\text{unrestricted}} - R^2_{\text{restricted}})/q}{(1 - R^2_{\text{unrestricted}})/(n-k)}\), I get a much smaller number (still rejecting the null hypothesis, though).

    My restricted model imposes \(\beta_k + \beta_l = 10\), i.e. \(\beta_l = 10 - \beta_k\), which gives \(y - 10l = \beta_0 + \beta_k(k-l) + \epsilon\):
    . gen Y_10L = y-10*l

    . gen k_l=k-l

    . regress Y_10L k_l

    With these commands and the formula above, I do not get a number equal to 107; I get around 47.

    I am just curious about what is going on. Is it a sample size issue?
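
    For concreteness, here is a minimal sketch of that computation in Stata, reusing the variables Y_10L and k_l generated above and the saved results e(r2) and e(df_r) after regress (the sketch is illustrative, not part of the original post):

    Code:
    * unrestricted regression: save R-squared and residual degrees of freedom
    qui regress y k l
    sca R2u = e(r2)
    sca DFu = e(df_r)

    * restricted regression (note that the dependent variable changes)
    qui regress Y_10L k_l
    sca R2r = e(r2)

    * R-squared version of the F-statistic, q = 1 restriction
    di ((R2u - R2r)/1) / ((1 - R2u)/DFu)

    Assuming this is the computation intended, it should reproduce the value of around 47 reported above.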



    My data is here:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(y k l)
    68.24103 9.313 45.0961
    69.15226 10.6264 43.9693
    70.05865 11.5423 41.8166
    79.74423 11.9624 44.4985
    89.43562 12.2972 48.7602
    93.72334 13.045 51.1402
    102.4281 13.6777 54.4577
    95.21549 14.2198 51.2944
    108.347 14.7225 54.0984
    108.1351 15.1736 55.7854
    107.1986 16.0311 55.9122
    100.4691 16.8214 52.6973
    109.6668 16.9557 56.4288
    115.2529 16.9042 56.9827
    116.6837 17.1108 56.0163
    129.3971 17.2227 58.5997
    132.359 17.4505 59.6128
    147.1149 17.8079 61.1658
    159.5804 18.4595 64.6947
    173.8529 19.6165 69.2726
    175.291 21.2163 70.161
    184.5142 22.4894 72.3024
    196.5472 23.5281 74.2756
    183.8358 24.7325 71.2039
    177.0066 25.6062 68.9305
    end

  • #2
    Perhaps the difference is because the test command performs a Wald test, while the F-statistic you are computing appears to be a likelihood ratio test statistic. (I'm not familiar with "the usual formula" you give, but that's what it looks like to me.)
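
    For reference, here is a sketch in standard textbook notation of the Wald-type F-statistic for \(q\) linear restrictions \(R\beta = r\) (general background, not quoted from any post):

    \[
    F \;=\; \frac{(R\hat{\beta} - r)'\,\bigl[R\,\widehat{\text{Var}}(\hat{\beta})\,R'\bigr]^{-1}\,(R\hat{\beta} - r)}{q}
    \]

    With the coefficient order (k, l, _cons), the restriction here corresponds to \(R = (1, 1, 0)\), \(r = 10\) and \(q = 1\), so the statistic reduces to \((\hat{\beta}_k + \hat{\beta}_l - 10)^2 / \widehat{\text{Var}}(\hat{\beta}_k + \hat{\beta}_l)\), which is exactly the quantity computed by hand in #3 below.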



    • #3
      I agree with William Lisowski: you do not have nested models with which to perform a likelihood ratio test.

      test k+l=10
      This is just a test of a linear combination of coefficients and a constant, namely k+l-10. Using lincom, you can get the same result as test, albeit with a t-statistic instead of an F-statistic, noting that \(\text{t}=\sqrt{\text{F}}\).

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input float(y k l)
      68.24103   9.313 45.0961
      69.15226 10.6264 43.9693
      70.05865 11.5423 41.8166
      79.74423 11.9624 44.4985
      89.43562 12.2972 48.7602
      93.72334  13.045 51.1402
      102.4281 13.6777 54.4577
      95.21549 14.2198 51.2944
       108.347 14.7225 54.0984
      108.1351 15.1736 55.7854
      107.1986 16.0311 55.9122
      100.4691 16.8214 52.6973
      109.6668 16.9557 56.4288
      115.2529 16.9042 56.9827
      116.6837 17.1108 56.0163
      129.3971 17.2227 58.5997
       132.359 17.4505 59.6128
      147.1149 17.8079 61.1658
      159.5804 18.4595 64.6947
      173.8529 19.6165 69.2726
       175.291 21.2163  70.161
      184.5142 22.4894 72.3024
      196.5472 23.5281 74.2756
      183.8358 24.7325 71.2039
      177.0066 25.6062 68.9305
      end
      
      regress y k l
      test  k+l=10
      lincom k+l-10
      di (-10.38)^2
      Res.:

      Code:
      
      .
      . test  k+l=10
      
       ( 1)  k + l = 10
      
             F(  1,    22) =  107.68
                  Prob > F =    0.0000
      
      .
      . lincom k+l-10
      
       ( 1)  k + l = 10
      
      ------------------------------------------------------------------------------
                 y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
               (1) |  -5.156307   .4968966   -10.38   0.000    -6.186807   -4.125806
      ------------------------------------------------------------------------------
      
      .
      . di (-10.38)^2
      107.7444
      Now the question is how lincom computes the t-statistic. As \(t = \text{coefficient}/\text{standard error}\), the coefficient here is simply the sum of the estimated coefficients minus 10:

      Code:
      di _b[k]+_b[l]-10
      Res.:

      Code:
      . di _b[k]+_b[l]-10
      -5.1563067
      The standard error is obtained using the usual formula \(\text{Var}(a+b) = \text{Var}(a) + \text{Var}(b) + 2\,\text{Cov}(a,b)\). The variance of the constant \(-10\) is 0, so it drops out of the equation.

      Code:
      mat l e(V)
      di sqrt((e(V)[1,1]) + (e(V)[2,2]) + (2*e(V)[1,2]))
      Res.:

      Code:
      symmetric e(V)[3,3]
                      k           l       _cons
          k   .72222981
          l  -.31387297    .1524224
      _cons   5.9776296  -3.5198448   103.93004
      
      .
      . di sqrt((e(V)[1,1]) + (e(V)[2,2]) + (2*e(V)[1,2]))
      .49689665
      Last edited by Andrew Musau; 24 Sep 2020, 10:09.
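
      Putting the two pieces together, a minimal check (an illustrative addition, not part of the post above) reproduces the F-statistic reported by test directly from the coefficients and the variance-covariance matrix:

      Code:
      * after -regress y k l-: squared t-ratio of the linear combination k + l - 10
      di ((_b[k] + _b[l] - 10) / sqrt(e(V)[1,1] + e(V)[2,2] + 2*e(V)[1,2]))^2

      This should return the same F of 107.68 (up to rounding) that test reports as F(1, 22).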



      • #4
        The way you impose the constraint is correct.

        You should not be using the R-squared version of the test when your dependent variable changes under the new reparametrisation. However, I am also not getting the same result when I use

        \(F = \dfrac{\text{RSS}_r - \text{RSS}_u}{\text{RSS}_u/(n-k)}\) (with a single restriction, \(q = 1\), so there is no division by \(q\) in the numerator).

        Here is what happens when I use the Residual Sum of Squares version of the test:

        Code:
        . gen kl = k - l
        
        . gen yy = y - 10*l
        
        . qui reg y k l
        
        . test k+l=10
        
         ( 1)  k + l = 10
        
               F(  1,    22) =  107.68
                    Prob > F =    0.0000
        
        . sca SSRu = e(rss)
        
        . sca DF = e(df_r)
        
        . qui reg yy kl
        
        . dis (e(rss)-SSRu )/(SSRu/e(df_r))
        112.57722
        I do not know what is going on here. I know where your problem is (you should not use the R-squared version of the test), but I cannot see where my problem is.



        • #5
          @Joro Kolev: the degrees of freedom in your last line are taken from the restricted regression. They should come from the unrestricted regression, which you saved as DF.
          On edit:
          Code:
          . display (4463.4569 - 757.20324)/(757.20324 / 22)
          107.68256
          Last edited by Eric de Souza; 24 Sep 2020, 11:08.



          • #6
            You are right, Eric. Of course that is why I saved the scalar DF after the unrestricted regression in command 6 of my code; having saved it, I should also use it afterwards. It does not help if I only save it and keep it :-).

            So everything is fine when we set up the formula correctly with RSS:

            Code:
            . dis (e(rss)-SSRu )/(SSRu/DF)
            107.68256
            The two tests are numerically equivalent.
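
            A brief note on why this holds (standard OLS algebra, not spelled out in the thread): for linear restrictions on an OLS model, the increase in the residual sum of squares equals the quadratic form in the Wald statistic,

            \[
            \text{RSS}_r - \text{RSS}_u = (R\hat{\beta} - r)'\bigl[R(X'X)^{-1}R'\bigr]^{-1}(R\hat{\beta} - r),
            \]

            and dividing by \(\hat{\sigma}^2 = \text{RSS}_u/(n-k)\) turns \(\hat{\sigma}^2 (X'X)^{-1}\) into the estimated variance-covariance matrix used by test, so the two F-statistics coincide exactly.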




            • #7
              Thanks for all the help. Much appreciated.
