Is there a Test for Dynamic Panel Model under Large N and Large T?

Joao Pancada

Join Date: Aug 2017

Posts: 14
#1

03 Sep 2017, 12:04

Dear Statalist users,

My sample with accounting and stock return data has large N = 745 and large T = 218, but the panel is unbalanced. I need to perform two regressions with different dependent variables.

Using the robust Hausman test, one regression requires fixed effects while the other can be done using random effects. In either case, I suspect I should use a dynamic rather than a static panel regression, based on running xtserial on each variable separately and xtreg with the lagged dependent variable alongside the other independent variables.

However, I cannot find anywhere how to formally test, given my sample, whether to use a dynamic or a static model. For example, the famous Arellano-Bond test through xtabond does not work because it assumes small T. Any thoughts?

A starting point could be to estimate both models and simply compare the magnitude of the estimates. However, xtdcce2 has one problem, as I flagged in this thread [here], and I couldn't find another command that works for my sample. Do you have any suggestions here as well?

Thank you very much!
Best, João
Tags: Arellano-Bond, dynamic panel model, panel data, test panel model, xtdcce2
Jesse Wursten

Join Date: Jan 2016

Posts: 915
#2

04 Sep 2017, 06:57

What test exactly are you looking for? If it's a serial correlation test, you can use xtqptest, xthrtest, or xtistest, available from SSC.
Comment
Joao Pancada

Join Date: Aug 2017

Posts: 14
#3

04 Sep 2017, 11:58

Hi Jesse,

I am looking for a test that can tell me, given my sample characteristics, whether to use a dynamic panel model or a static model. Perhaps the best way to do it is to test for endogeneity since exogeneity is automatically violated in dynamic panel models. I don't think it is enough to show that my dependent variable is autocorrelated using a serial correlation test.

Furthermore, if I have to use a dynamic model, I am also looking for a model that works for my sample (large N and T) and is already implemented in Stata.

Those are the two issues I have not been able to solve.

Thanks.
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 2785
#4

04 Sep 2017, 13:10

Dear Joao,

I would say that the choice between a dynamic and a static model is a modeling decision that only you can make because the choice very much depends on the questions you want to answer. So, a statistical test may not be particularly useful in this context.

Best regards,

Joao
3 likes
Comment
Joao Pancada

Join Date: Aug 2017

Posts: 14
#5

04 Sep 2017, 15:05

Dear João,

Thank you for your reply. I asked about a test because I was intrigued by the following: “If the dependent variable follows a simple AR (1) process, the simplest of dynamic models, the standard within estimator is biased because strict exogeneity fails.” (Peijie Wang, 2009, Financial Econometrics book). But I understand your point and now I will look for endogeneity tests in panel data (perhaps the Durbin–Wu–Hausman test). I am not interested in the lagged effect of the dependent variable; I just don't want its omission to "cause problems" in my regression.

In addition to an endogeneity test, I was thinking about simply estimating a dynamic model and checking whether the results are similar, as in the Giannetti and Wang (2016) paper in The Journal of Finance. However, I am not able to find in Stata a dynamic model that is theoretically appropriate for my sample. Any ideas?

Best regards
Comment
Richard Williams

Join Date: Apr 2014

Posts: 4838
#6

05 Sep 2017, 08:23

Since N is large and T is large, I believe you could use xtreg with lagged Y as an independent variable. With T = 218 the bias wouldn't be much, especially if the autoregressive parameter isn't that big. The t value for lagged Y would be an empirical test of whether it should be in the model.

If I am wrong about this, please correct me!
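
A minimal simulation sketch of the point above, written in Python rather than Stata so that it is self-contained. It assumes a pure AR(1) panel with unit fixed effects, no other regressors, a balanced panel, and an autoregressive parameter of 0.15; the function name and all settings are illustrative, not taken from the actual data. It estimates the autoregressive parameter with the within (fixed-effects) estimator for several values of T.

```python
import random


def within_ar1_estimate(n_units, n_periods, rho, rng, burn_in=50):
    """Within (fixed-effects) estimate of rho in y_it = a_i + rho * y_i,t-1 + e_it."""
    numerator = 0.0
    denominator = 0.0
    for _ in range(n_units):
        a_i = rng.gauss(0.0, 1.0)          # unit fixed effect
        y = a_i / (1.0 - rho)              # start at the unit's long-run mean
        path = []
        for t in range(burn_in + n_periods + 1):
            y = a_i + rho * y + rng.gauss(0.0, 1.0)
            if t >= burn_in:               # discard the burn-in draws
                path.append(y)
        lagged, current = path[:-1], path[1:]   # n_periods usable observations
        mean_lag = sum(lagged) / n_periods
        mean_cur = sum(current) / n_periods
        # Accumulate the within-demeaned cross product and sum of squares.
        for y_lag, y_cur in zip(lagged, current):
            numerator += (y_lag - mean_lag) * (y_cur - mean_cur)
            denominator += (y_lag - mean_lag) ** 2
    return numerator / denominator


rng = random.Random(12345)
rho = 0.15                                  # assumed true autoregressive parameter
results = {}
for T in (5, 20, 200):
    results[T] = within_ar1_estimate(n_units=500, n_periods=T, rho=rho, rng=rng)
    print(f"T = {T:3d}: within estimate of rho = {results[T]:.4f} (true value {rho})")
```

Under these assumptions the estimate is pulled well below the true value for short panels and moves close to it as T grows, which is the pattern the argument relies on. It says nothing about unbalanced panels or additional regressors.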

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
Stata Version: 17.0 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
1 like
Comment
Joao Pancada

Join Date: Aug 2017

Posts: 14
#7

05 Sep 2017, 12:41

Dear Richard,

I think I understand why you say the bias shouldn't be high when T is large. Wooldridge in his 2009 book (Econometric Analysis of Cross Section and Panel Data) says that if Y and X are weakly dependent and T is large, then the bias arising from using fixed effects (or xtreg) when strict exogeneity fails is minimal (it decreases with T). Are you referring to something along those lines?

I ran the regression as you suggested, and the parameter is low (~0.15) but the t-stat is very large even after clustering (t = 17). As a matter of fact, the lagged dependence goes all the way to the 3rd lag (parameter is 0.03 and t-stat = 4). So I guess I should include them. I haven't run all regressions but, except for one or two regressors, I get very similar conclusions in terms of magnitude and significance.

However, now I have a doubt. I am inclined to use the Common Correlated Effects (CCE) model from Pesaran (2006) rather than xtreg because my data has a lot of cross-sectional dependence. I am not sure whether what Wooldridge says also applies to this model, even if fixed effects can be shown to be a special case of CCE. I guess it should, but the bias derivation would at least be different.

Last edited by Joao Pancada; 05 Sep 2017, 12:43.
Comment
Richard Williams

Join Date: Apr 2014

Posts: 4838
#8

05 Sep 2017, 13:34

My sometimes co-author Enrique Moral-Benito sent me this:

The reference could be Arellano's book (Panel data econometrics, 2003, Oxford University Press) pages 85-86.

The point is that the Nickell bias (in the autoregressive parameter in dynamic panel data models with fixed effects) vanishes as T increases but it may be large in short panels. Indeed, there is an exact formula (for a model without further control variables) and Table 6.1 on page 86 shows the magnitude of the bias for different values of T and the autoregressive parameter.
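
For orientation, a worked approximation. The expression below is the commonly quoted large-T approximation from Nickell (1981) for the pure AR(1) fixed-effects model, not the exact formula tabulated in Arellano's book, and T here counts the time periods used in estimation. Plugging in the values mentioned in this thread (T = 218, autoregressive parameter around 0.15) is only indicative, because the actual panel is unbalanced and the model includes other regressors.

```latex
\operatorname*{plim}_{N \to \infty} \left( \hat{\rho}_{FE} - \rho \right)
  \approx -\frac{1 + \rho}{T - 1},
\qquad
-\frac{1 + 0.15}{218 - 1} = -\frac{1.15}{217} \approx -0.0053 .
```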

Wooldridge sounds similar. I don't know if CCE would be better or not.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
Stata Version: 17.0 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
1 like
Comment
Joao Pancada

Join Date: Aug 2017

Posts: 14
#9

05 Sep 2017, 14:34

Exactly, the Nickell bias. I had it in my notes, from the Alvarez and Arellano (2003) paper in Econometrica. They also mention that the bias would only disappear when N/T -> 0, which is not my case. Nonetheless, thank you for the reference, I'll check it out.

Last edited by Joao Pancada; 05 Sep 2017, 14:37.
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 2785
#10

05 Sep 2017, 15:33

Dear All,

Let me go back to my earlier comment and give an example. Suppose you want to model migration between pairs of countries as a function of distance, colonial links, etc. If we estimate a dynamic model we may find that the lagged flow is very significant and distance and other variables have a small effect or no effect at all. However, a static model may reveal that distance and other variables are important factors. The dynamic model is certainly better for prediction, but is somewhat tautological in the sense that it tells us that migration this year will be high for countries that had strong migration in the previous year, but we cannot explain what generated the strong migration in the past; for that a static model may be more useful. So, to my mind the choice between a static and a dynamic model has more to do with the question we are asking than with the characteristics of the data.

Best wishes,

Joao
Comment
Joao Pancada

Join Date: Aug 2017

Posts: 14
#11

05 Sep 2017, 16:10

Dear Joao,

I understand your point, and after your first post I realized I was asking the wrong question. I am not interested in the lagged values of Y as such, and so not in a dynamic model for its own sake. I simply suspect that my static model violates the strict exogeneity assumption because lagged values of Y, which are not included in the regression but greatly affect Y, are likely to be correlated with the regressors. Therefore, what I really need is a test for endogeneity or omitted variable bias. Then, if the assumption fails, I have two options: 1) ignore the problem because T is large and the bias is "small"; 2) take it into account and modify my model, which I don't know how to do, and that is why I was talking about maybe using a dynamic model.

Given this clarification, can you recommend a course of action?

Thanks.
Comment
Joao Santos Silva

Join Date: Apr 2014

Posts: 2785
#12

06 Sep 2017, 00:58

Joao,

I do not really follow your argument, but I suggest that you read Wooldridge's book carefully. He describes a test for strict exogeneity and that may be useful in your context.
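
A rough sketch of one commonly described version of such a test, written in Python so that it is self-contained: add the one-period lead of a regressor to the fixed-effects regression and check whether its coefficient is distinguishable from zero. The data-generating process, the `feedback` parameter, and the function name are all hypothetical, and the standard error is the conventional homoskedastic one, whereas applied work would normally cluster by unit.

```python
import math
import random


def lead_test(feedback, n_units=200, n_periods=30, beta=1.0, seed=1):
    """Return (coefficient, t-statistic) on the lead of x in a within regression."""
    rng = random.Random(seed)
    rows = []  # within-demeaned (y, x, lead of x) triples
    for _ in range(n_units):
        a_i = rng.gauss(0.0, 1.0)
        x = rng.gauss(0.0, 1.0)
        ys, xs, leads = [], [], []
        for _ in range(n_periods):
            u = rng.gauss(0.0, 1.0)
            # feedback != 0: today's shock u moves next period's regressor,
            # so strict exogeneity fails even though x_t is uncorrelated with u_t.
            x_next = 0.5 * x + feedback * u + rng.gauss(0.0, 1.0)
            ys.append(a_i + beta * x + u)
            xs.append(x)
            leads.append(x_next)
            x = x_next
        my, mx, ml = (sum(v) / n_periods for v in (ys, xs, leads))
        rows.extend((y - my, xv - mx, lv - ml) for y, xv, lv in zip(ys, xs, leads))

    # Two-regressor OLS on the demeaned data via the normal equations.
    sxx = sum(x * x for _, x, _ in rows)
    szz = sum(z * z for _, _, z in rows)
    sxz = sum(x * z for _, x, z in rows)
    sxy = sum(x * y for y, x, _ in rows)
    szy = sum(z * y for y, _, z in rows)
    det = sxx * szz - sxz ** 2
    b_x = (szz * sxy - sxz * szy) / det
    b_lead = (sxx * szy - sxz * sxy) / det
    ssr = sum((y - b_x * x - b_lead * z) ** 2 for y, x, z in rows)
    dof = n_units * n_periods - n_units - 2   # observations minus unit means and slopes
    se_lead = math.sqrt(ssr / dof * sxx / det)
    return b_lead, b_lead / se_lead


coef_fb, t_fb = lead_test(feedback=0.5)
coef_none, t_none = lead_test(feedback=0.0)
print(f"with feedback:    lead coefficient = {coef_fb:.3f}, t = {t_fb:.1f}")
print(f"without feedback: lead coefficient = {coef_none:.3f}, t = {t_none:.1f}")
```

In this setup a clearly significant lead coefficient signals that strict exogeneity fails, while without feedback the lead coefficient should be statistically indistinguishable from zero.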

Best wishes,

Joao
Comment
Sebastian Kripfganz

Join Date: May 2014

Posts: 2443
#13

06 Sep 2017, 04:18

I agree with Joao that the research question should guide you towards the modelling choice in the first place. But there is also an econometric argument against an incautious use of the static model: If both the dependent variable and the regressors are highly persistent, the regression coefficients in a static model might just pick up a common trend among these variables without there being an underlying (causal) relationship. We know this as the spurious regression problem in the time series context but the problem also applies to panel data, in particular if T is relatively large. A dynamic model specification can guard you against such spurious correlation.
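
A minimal sketch of the spurious regression problem for a single time series pair, in Python and under simplified assumptions: two independent random walks of length 200 are regressed on each other in a static model, and the share of replications with |t| > 1.96 is compared with the same exercise for independent white noise. All names and settings are illustrative; a panel version would add fixed effects but the mechanism is the same.

```python
import math
import random


def slope_t_stat(x, y):
    """Conventional OLS t-statistic on the slope of y regressed on x with a constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    ssr = sum((b - mean_y - slope * (a - mean_x)) ** 2 for a, b in zip(x, y))
    return slope / math.sqrt(ssr / (n - 2) / sxx)


def random_walk(n, rng):
    """Cumulative sum of independent standard normal shocks."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def white_noise(n, rng):
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def rejection_rate(make_series, n_periods, n_reps, rng):
    """Share of replications where two independent series look 'significantly' related."""
    rejections = 0
    for _ in range(n_reps):
        x = make_series(n_periods, rng)
        y = make_series(n_periods, rng)   # generated independently of x
        if abs(slope_t_stat(x, y)) > 1.96:
            rejections += 1
    return rejections / n_reps


rng = random.Random(2017)
rate_walks = rejection_rate(random_walk, n_periods=200, n_reps=500, rng=rng)
rate_noise = rejection_rate(white_noise, n_periods=200, n_reps=500, rng=rng)
print(f"independent random walks: share with |t| > 1.96 = {rate_walks:.2f}")
print(f"independent white noise:  share with |t| > 1.96 = {rate_noise:.2f}")
```

The white-noise case should reject at roughly the nominal 5% rate, while the random-walk case rejects far more often even though the series are unrelated by construction.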

With regard to the CCE estimator, the underlying arguments do not fundamentally change. The bias still vanishes with increasing T, although the formula will look different.

https://twitter.com/Kripfganz
2 likes
Comment
Joao Pancada

Join Date: Aug 2017

Posts: 14
#14

06 Sep 2017, 06:03

Thanks for the reply, Sebastian. Yes, I believe I have some persistent regressors because my Y variable is at a higher frequency (weekly) than the number of analysts and advertising (quarterly and annual, respectively). From what I see in the literature on spurious regressions, that problem would occur if variables are I(1) or have "long memory" (i.e. I(0) but still highly persistent).

Let me just confirm I understand the econometrics here. Is the problem of having highly persistent variables one example of the omitted variable bias problem? It seems so from your description of the consequences of having a static model. As I mentioned in previous posts, intuition tells me that the lag of Y affects Y and is correlated with some of the regressors. Therefore, I would like to include the lag of Y in the model simply to avoid the omitted variable problem and not because of my research question. Can you advise me on a dynamic model that works under large N and T?

Last edited by Joao Pancada; 06 Sep 2017, 06:12.
Comment