prais with rhotype(nagar) option

Chirok Han

Join Date: Jan 2024

Posts: 5
#1

prais with rhotype(nagar) option

03 Jan 2024, 23:03

The manual for -prais- says that the 'rhotype(nagar)' option computes rho by the formula (1) rho_nagar = (rho_dw*N^2 + k^2)/(N^2 - k^2). I cannot verify this. I run two Prais-Winsten regressions for the same model with the same data using rhotype(dw) and rhotype(nagar). Both are two-step estimators so rho is estimated one step using the same OLS residuals for both commands. Therefore, (1) should hold if the Manual is correct. We have

Code:

. webuse qsales, clear . prais csales isales, two rho(dw) . di e(rho) .63263622 . prais csales isales, two rho(nagar) . di e(rho) .6702161

For this data set and model, we have N = 20 and k = 2. I have checked it (1) is satisfied. It isn't:

Code:

. prais csales isales, two rho(dw) . scalar rho_dw = e(rho) . di (rho_dw*20^2 + 2^2)/(20^2 - 2^2) .64912749

Instead, the equality holds if k is replaced with k+1 = 3

Code:

. di (rho_dw*20^2 + 3^2)/(20^2 - 3^2) /* note 3^2 instead of 2^2 */ .67021608

How can I make sense of the discrepancy between the Manual and the experiment results?
Tags: None
Andrew Musau

Join Date: Oct 2014

Posts: 10235
#2

04 Jan 2024, 17:01

It depends of how you define k. In linear regression estimated using OLS, you normally refer to k as the number of slope parameters. In AR1 regression from

Code:

help arima

you have

arima stores the following in e():

Scalars
e(N) number of observations
e(N_gaps) number of gaps
e(k) number of parameters

So you would count the constant term as well.
Comment
Chirok Han

Join Date: Jan 2024

Posts: 5
#3

05 Jan 2024, 00:57

Thanks for your reply.

I would have thought the same way you did if it was k-1 (i.e., 1), but the problem is `rhotype(nagar)` uses 3 (which is 2 + # of slope parameters) when there is one non-constant variable (with the intercept included). I think it's either the manual or the code that is buggy.
Comment
Andrew Musau

Join Date: Oct 2014

Posts: 10235
#4

05 Jan 2024, 01:13

The autocorrelation coefficient (rho) is a parameter. So estimated parameters in your example are \(\widehat{\beta}_0,\; \widehat{\beta}_1\) and \(\hat{\rho}\).

Last edited by Andrew Musau; 05 Jan 2024, 01:27.
Comment
Chirok Han

Join Date: Jan 2024

Posts: 5
#5

06 Jan 2024, 18:49

I've checked prais.ado, where k is erroneously computed as the number of all variables including the dependent variable, not just the regressors. I think it's a bug. Stata Corp. should attend it. The relevant lines of -prais.ado- are lines 111, 147-149, and 195-196.
Comment

Andrew Musau

Join Date: Oct 2014
Posts: 10235

06 Jan 2024, 19:07

You should contact Technical Services at https://www.stata.com/support/tech-support/contact/ and inform them of your doubt. Regardless of how you interpret the command's ado file, my point in #2 remains. If you define k as the total number of parameters, then you have 3 as I show in #4. But there is no explicit definition in the prais documentation, which leaves it open to interpretation. Let's consider an example from arima where k is explicitly defined as the total number of parameters.

Code:

webuse friedman2, clear
arima consump m2, ar(1) nolog
display e(k)

Res.:

Code:

. webuse friedman2, clear

. arima consump m2, ar(1) nolog

ARIMA regression

Sample: 1959q1 thru 1998q3                      Number of obs     =        159
                                                Wald chi2(2)      =   36787.61
Log likelihood = -732.6621                      Prob > chi2       =     0.0000

------------------------------------------------------------------------------
             |                 OPG
     consump | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
consump      |
          m2 |    1.12214   .0464611    24.15   0.000     1.031078    1.213202
       _cons |   468.3314   567.6004     0.83   0.409    -644.1449    1580.808
-------------+----------------------------------------------------------------
ARMA         |
          ar |
         L1. |   .9989295   .0053841   185.53   0.000     .9883769    1.009482
-------------+----------------------------------------------------------------
      /sigma |    23.7996   .9726719    24.47   0.000      21.8932      25.706
------------------------------------------------------------------------------
Note: The test of the variance against zero is one sided, and the two-sided
      confidence interval is truncated at zero.

. display e(k)
4

Here, you have k=4, i.e., _b[consump:m2], _b[consump:_cons], _b[ARMA:L.ar] and _b[sigma:_cons].

Comment

Chirok Han

Join Date: Jan 2024

Posts: 5
#7

08 Jan 2024, 00:25

You mean rho is counted as a parameter, in which case k = 3 for simple linear regression. Thanks for the point, and sorry I misunderstood you.

I've checked the literature carefully as now I have access to my copy of Judge et al. (1985). TLDR: The "k" is the number of columns of X, so rho is not counted in.

Page 275, beginning of Section 8.1: "For the model y=Xbeta+e, where ... X is a (TxK) nonstochastic design metrix..."
Page 287, Section 8.2.1b: "Theil and Nagar (1961) suggest that the estimator rho^* = [T^2 (1-d/2) + K^2] / [T^2 - K^2] will be ..."

The definition of K didn't change between pages 275 and 287.

The Stata Manual is OK because it is unclear. The -prais- code is buggy if Judge et al. (1985) is correct. The -prais- ado file is coded so that the dependent variable is counted in, which is the source of the problem.

I hope that this list is attended by Stata Corp and the bug is fixed soon.
Comment
Andrew Musau

Join Date: Oct 2014

Posts: 10235
#8

08 Jan 2024, 00:41

Page 275, beginning of Section 8.1: "For the model y=Xbeta+e, where ... X is a (TxK) nonstochastic design metrix..."
Page 287, Section 8.2.1b: "Theil and Nagar (1961) suggest that the estimator rho^* = [T^2 (1-d/2) + K^2] / [T^2 - K^2] will be ..."

Agreed, this suggests that \(K\) equals 2 in your example. It is not guaranteed that StataCorp will come across this thread, so what is recommended is to contact Technical Services. I will nevertheless tag Gustavo Sanchez (StataCorp).
Comment
Chirok Han

Join Date: Jan 2024

Posts: 5
#9

08 Jan 2024, 03:57

Thanks.
Comment

Announcement