Newey Regression for panel data

Najia Mughal

Join Date: Jul 2018

Posts: 7
#1


19 Dec 2018, 15:11

Hello,

I wanted to know whether the Newey-West command can be used for panel data. If not, how can I get Newey-West standard errors for panel data?

Regards,
Najia

Joro Kolev

Join Date: Aug 2018
Posts: 3050
#2

20 Dec 2018, 05:12

Yes, you can use -newey- on panel data. Curiously enough, this use of -newey- is not documented in the manual, but I show below how to do it.

So I also have a question for anybody who knows anything about this mystery: Why are this use and the -force- option of -newey- not documented in the Stata manual, at least from version 11 onward?

(I also have a question for myself: How do I know that -newey- can be used like that? My only guess is that this is something coming from old Stata, such as Stata 7, and the -force- option was documented in Stata 7... I need to check.)

The key to making -newey- work on panel data is the -force- option. Without the -force- option Stata issues an error message and refuses to calculate the estimator.

Code:

. webuse grunfeld , clear

. tsset company year
       panel variable:  company (strongly balanced)
        time variable:  year, 1935 to 1954
                delta:  1 year

. newey invest mvalue kstock, lag(3)
year is not regularly spaced
r(198);

. newey invest mvalue kstock, lag(3) force

Regression with Newey-West standard errors      Number of obs     =        200
maximum lag: 3                                  F(  2,       197) =      79.67
                                                Prob > F          =     0.0000

------------------------------------------------------------------------------
             |             Newey-West
      invest |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      mvalue |   .1155622   .0107291    10.77   0.000     .0944035    .1367209
      kstock |   .2306785   .0662771     3.48   0.001     .0999749    .3613821
       _cons |  -42.71437   15.73997    -2.71   0.007    -73.75483    -11.6739
------------------------------------------------------------------------------

.


Najia Mughal

Join Date: Jul 2018

Posts: 7
#3

20 Dec 2018, 11:11

Thanks a lot, Kolev. It worked for me. I had also been looking for this command in the manual for a long time but couldn't find it. And sorry, I don't know the answer to your questions.
Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#4

20 Dec 2018, 12:54

Dear Najia Mughal and Joro Kolev,

This is very interesting, but I struggle to understand why one would want to use Newey-West standard errors in this context and how that would work. Why not use the popular clustered standard errors or the Driscoll-Kraay standard errors available via xtscc?

Best wishes,

Joao
Joro Kolev

Join Date: Aug 2018

Posts: 3050
#5

20 Dec 2018, 13:06

Joao Santos Silva, one would want to use the Newey-West variance on panel data when one is ready to assume independence in the cross section but wants to guard against heteroskedasticity and autocorrelation in the time-series dimension.

It works the same way as it works for time series data, except that now you have multiple time series, one time series for each cross sectional unit. And again, we are assuming that the multiple time series are not correlated with each other.
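To make the description above concrete, here is a sketch of the variance estimator it implies. It assumes that lagged cross-products are formed within each cross-sectional unit only, which the thread does not confirm from the Stata manual. The notation is introduced here for illustration: x_it is the regressor vector, e-hat_it the pooled OLS residual, m the maximum lag, N the number of units and T the number of periods. Any finite-sample scaling factor is omitted.

```latex
\[
\widehat{V} = (X'X)^{-1}\,\widehat{S}\,(X'X)^{-1},
\qquad
\widehat{S} = \sum_{i=1}^{N}\left[
  \sum_{t=1}^{T} \hat e_{it}^{2}\, x_{it} x_{it}'
  + \sum_{l=1}^{m} w_l \sum_{t=l+1}^{T} \hat e_{it}\,\hat e_{i,t-l}
    \left( x_{it} x_{i,t-l}' + x_{i,t-l} x_{it}' \right)
\right],
\qquad
w_l = 1 - \frac{l}{m+1}.
\]
```

No term pairs a residual from unit i with a residual from a different unit j. That absence is exactly the cross-sectional independence assumption.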

Driscoll-Kraay standard errors, available via xtscc, are appropriate if you want to guard against heteroskedasticity and autocorrelation in the time-series dimension (as we did with -newey- above) but, on top of this, are not ready to assume that the cross-sectional units are uncorrelated. In other words, Driscoll-Kraay standard errors guard against correlation in both the time-series and the cross-sectional dimension.
Joao Santos Silva

Join Date: Apr 2014

Posts: 3011
#6

20 Dec 2018, 14:14

Thanks, Joro Kolev.

Yes, the DK standard errors have the advantage of not requiring independent units, so why use NW? If we are willing to assume independence, clustered standard errors are robust against general forms of serial correlation and therefore also more robust (assuming that N is large); again, why use NW?

Best wishes,

Joao
Andrew Musau

Join Date: Oct 2014

Posts: 10195
#7

20 Dec 2018, 14:31

Even if we accept that one can apply Newey to panel data, my question would be how do you account for the possibility of different optimal lag lengths (since you have independent cross-sections and multiple time series)?

Joro Kolev

Join Date: Aug 2018
Posts: 3050
#8

20 Dec 2018, 17:05

Joao Santos Silva, everywhere in life, including statistics and econometrics, the less you assume, the less you can deliver. If we go along your line of reasoning, we can ask similar questions:

Why assume linearity and do linear regression, when some more complicated nonlinear function "has the advantage" of greater generality over a linear function? Why assume any function at all, when nonparametric regression "has the advantage" of not assuming any functional form? Why use nonparametric regression, when we can just stare at our data? Staring at all the data points at the same time "has the advantage" that it does not impose any structure on our data; it just takes our data as they are.

Generally speaking, Driscoll-Kraay standard errors do not have an advantage over Newey-West standard errors for panel data. The former assume less than the latter, and might deliver less.

Delivering less might be reflected in worse sized tests, or in larger standard errors.

In general, to get anywhere, we need to assume something.

Or there might not be much of a difference, and then we can ask the reverse of your question: If both methods give us the same answer, where exactly is the advantage of one over the other?

In the example below, there does not seem to me to be much of a difference:

Code:

. webuse grunfeld , clear

. newey invest mvalue kstock, lag(3) force

Regression with Newey-West standard errors      Number of obs     =        200
maximum lag: 3                                  F(  2,       197) =      79.67
                                                Prob > F          =     0.0000

------------------------------------------------------------------------------
             |             Newey-West
      invest |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      mvalue |   .1155622   .0107291    10.77   0.000     .0944035    .1367209
      kstock |   .2306785   .0662771     3.48   0.001     .0999749    .3613821
       _cons |  -42.71437   15.73997    -2.71   0.007    -73.75483    -11.6739
------------------------------------------------------------------------------

. xtscc invest mvalue kstock, lag(3)

Regression with Driscoll-Kraay standard errors   Number of obs     =       200
Method: Pooled OLS                               Number of groups  =        10
Group variable (i): company                      F(  2,    19)     =     93.20
maximum lag: 3                                   Prob > F          =    0.0000
                                                 R-squared         =    0.8124
                                                 Root MSE          =   94.4084

------------------------------------------------------------------------------
             |             Drisc/Kraay
      invest |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      mvalue |   .1155622   .0130406     8.86   0.000     .0882679    .1428564
      kstock |   .2306785   .0509161     4.53   0.000     .1241098    .3372472
       _cons |  -42.71437   12.65602    -3.38   0.003    -69.20373   -16.22501
------------------------------------------------------------------------------

Both methods seem to paint the same picture in this example.
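For completeness, the clustered standard errors raised earlier in the thread can be computed for the same pooled OLS model. This is only a minimal sketch on the same Grunfeld data. Note that there are just 10 companies here, so the large-N caveat mentioned in #6 applies.

```stata
* Sketch: cluster-robust standard errors for the same pooled OLS model.
* Assumes the Grunfeld data loaded via -webuse-, as in the examples above.
webuse grunfeld, clear

* Clustering on company allows arbitrary serial correlation within a company
* but treats different companies as independent of each other.
regress invest mvalue kstock, vce(cluster company)
```

The coefficient estimates are again those of pooled OLS; only the standard errors change.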


Joro Kolev

Join Date: Aug 2018

Posts: 3050
#9

20 Dec 2018, 17:18

Originally posted by Andrew Musau

Even if we accept that one can apply Newey to panel data, my question would be how do you account for the possibility of different optimal lag lengths (since you have independent cross-sections and multiple time series)?

Optimal lag selection is a small and marginal sub-field of HAC (heteroskedasticity and autocorrelation consistent) variance estimation. I am not a specialist in this field, so without reading the literature again I cannot tell you off the top of my head how to program an optimal lag selector even for a single time series. (-ivregress- knows how to choose lags optimally for a single time series.)

This is not that important, because the HAC variances are valid if the lag you have selected grows with the sample size at "some suitable rate". There are a couple of rules of thumb cited in various places by Stock and Watson.

With a non-optimal lag that grows at a suitable rate with the sample size, the problem that you worry about does not arise at all for balanced panels, because all the time series have the same length.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#10

21 Dec 2018, 04:27

I remember a phrase that I once heard from Jeff Wooldridge:
"If you assume more, you can gain more. If you assume less, you will lose less."

If you are confident that your cross sections are independent, you can gain efficiency by utilizing this assumption. On the other hand, DK standard errors can provide you with robustness by guarding against a potential violation of this independence assumption.

The usual panel-clustered standard errors work well in small-T, large-N settings, but I guess we are talking here about the opposite situation: large T, small N.

https://www.kripfganz.de/stata/
Eric de Souza

Join Date: Mar 2014

Posts: 587
#11

21 Dec 2018, 07:45

I was going to reply to #2 above but wanted to experiment a bit first. I haven't found the time.
Notice that the coefficient estimates obtained by using -newey ...., force- are identical to the coefficients obtained by using -reg-.
What led me to look at this was the error message obtained when using -newey- without the force option.
So I had a look at the Stata manual for xtreg and saw that the force option is documented there.
-newey- with the force option just ignores the panel structure of the data. But how does it handle the switch from one panel to another?
This is what I wanted to experiment with but haven't found the time to do so.
Therefore, I agree with Joao Santos Silva: why not use clustered errors? I would add: at least until one is sure of what exactly newey combined with force is actually doing.
Sebastian Kripfganz

Join Date: May 2014

Posts: 2594
#12

21 Dec 2018, 09:02

Eric: I was wondering about the same thing.
There is also the community-contributed newey2 command on SSC that explicitly supports panel data and yields identical results to newey, force.

https://www.kripfganz.de/stata/

Andrew Musau

Join Date: Oct 2014
Posts: 10195

#13

21 Dec 2018, 09:43

It is actually simple what newey with the force option does. It creates a weakly balanced panel where the cross-sections are distinct but the time variable is continuous. What you need for newey to work is a continuous time variable.

Code:

. webuse grunfeld

. xtset company time
       panel variable:  company (strongly balanced)
        time variable:  time, 1 to 20
                delta:  1 unit

. newey invest mvalue kstock, lag(3) force

Regression with Newey-West standard errors      Number of obs     =        200
maximum lag: 3                                  F(  2,       197) =      79.67
                                                Prob > F          =     0.0000

------------------------------------------------------------------------------
             |             Newey-West
      invest |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      mvalue |   .1155622   .0107291    10.77   0.000     .0944035    .1367209
      kstock |   .2306785   .0662771     3.48   0.001     .0999749    .3613821
       _cons |  -42.71437   15.73997    -2.71   0.007    -73.75483    -11.6739
------------------------------------------------------------------------------


. sort company year

. gen t=_n

. xtset company t
       panel variable:  company (weakly balanced)
        time variable:  t, 1 to 200
                delta:  1 unit

. newey invest mvalue kstock, lag(3)

Regression with Newey-West standard errors      Number of obs     =        200
maximum lag: 3                                  F(  2,       197) =      79.67
                                                Prob > F          =     0.0000

------------------------------------------------------------------------------
             |             Newey-West
      invest |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      mvalue |   .1155622   .0107291    10.77   0.000     .0944035    .1367209
      kstock |   .2306785   .0662771     3.48   0.001     .0999749    .3613821
       _cons |  -42.71437   15.73997    -2.71   0.007    -73.75483    -11.6739
------------------------------------------------------------------------------
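A complementary check, sketched here as an experiment rather than a documented result, is to declare the stacked data as one long time series with no panel variable and compare the variance matrices. The variable t and the matrix names V_panel and V_stack are chosen for illustration only.

```stata
* Sketch: does -newey, force- form lags across company boundaries?
webuse grunfeld, clear
xtset company year

* Panel declaration with the force option, as in the thread.
newey invest mvalue kstock, lag(3) force
matrix V_panel = e(V)

* Same data declared as one stacked series of 200 observations (no panel
* variable), so lags are allowed to cross from one company to the next.
sort company year
generate t = _n
tsset t
newey invest mvalue kstock, lag(3)
matrix V_stack = e(V)

* Relative difference between the two variance estimates.
display mreldif(V_panel, V_stack)
```

A relative difference of zero would suggest that force ignores the panel boundaries. A nonzero value would suggest that lagged products are formed within each company only.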

Thanks Joro for your reply to my question in #9. I found the following discussion on Cross Validated (reply by Candamir) that provides some references. The key is that your panel needs to be balanced for this approach to make sense (or at least not too unbalanced).


Joro Kolev

Join Date: Aug 2018

Posts: 3050
#14

21 Dec 2018, 10:51

Andrew, the Cross Validated discussion, and Candamir in particular, says exactly what I told you.

I don't quite agree with the bombastic language that Achim Zeileis uses ("A proper nonparametric lag selection procedure is introduced Newey & West (1994, Review of Economic Studies).") but yes, the references he lists are the important references in the literature. What Achim calls "proper" is optimal lag selection. I need to read the paper again to see what exactly they are optimising, but they are optimising something, and this is how they determine the optimal lag length.

More importantly, I disagree with Achim's use of the word "proper" because it falsely implies that there is something improper in not using the optimal lag length, and this is simply not true.

As I mentioned before, as long as the lag length grows with the sample size at some "suitable rate", as discussed in the original Newey-West paper, the HAC estimation works perfectly fine.

The rule that I usually use is from Stock and Watson "Introduction to Econometrics"

(eq. 15.17) m = 0.75*T^(1/3).

Finally, of course I agree with you that if the panels are of vastly different lengths, this will cause conceptual problems. Do we apply the rule above to the shortest or to the longest panel? It is not clear, and one can argue both ways.
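The rule of thumb above can be applied mechanically to a balanced panel. The following is a minimal sketch, assuming yearly data with no gaps. Rounding to the nearest integer is a choice made here for illustration, and the local macro names T and m are hypothetical.

```stata
* Sketch: rule-of-thumb truncation lag m = 0.75*T^(1/3) for a balanced panel.
webuse grunfeld, clear
xtset company year

* T = number of periods per company; assumes a balanced panel with delta = 1.
local T = r(tmax) - r(tmin) + 1

* Rounding to the nearest integer is an assumption made for illustration.
local m = round(0.75 * `T'^(1/3))
display "T = `T', lag = `m'"

* Newey-West standard errors with the rule-of-thumb lag.
newey invest mvalue kstock, lag(`m') force
```

For an unbalanced panel, the choice of T in this calculation is exactly the open question raised above.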
Andrew Musau

Join Date: Oct 2014

Posts: 10195
#15

21 Dec 2018, 10:57

Perfect! Thanks very much.