
  • Panel data with large T and small N

    Dear All,

    I am trying to do a study on the Western Balkan countries and I am not sure which estimation method is best to use. I am faced with N=5 and T=20. I could shorten the time period, but I would like to retain it if possible. As far as I am aware, the bias-corrected LSDV (LSDVC) estimator is the one suited for this sort of panel data.

    I also tried the RE and FE methods, but after running those estimations and using the Hausman test I get a negative chi2 with this message:

    chi2<0 ==> model fitted on these
    data fails to meet the asymptotic
    assumptions of the Hausman test;
    see suest for a generalized test

    Is there any other option for me to do this investigation or is the LSDVC good enough?

    I appreciate anyone's help.

  • #2
    If we assume that T=20 is "large", check the -xtgls- command; you can also include dummies for the N entities (countries?).
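    For instance, a minimal sketch along these lines (country, year, y, x1, and x2 are hypothetical names; the data would need to be -xtset- first):

    Code:
    xtset country year
    xtgls y x1 x2 i.country, panels(heteroskedastic) corr(ar1)

    Here i.country adds the entity dummies, panels(heteroskedastic) allows panel-specific error variances, and corr(ar1) allows first-order serial correlation.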



    • #3
      Originally posted by Joro Kolev View Post
      If we assume that T=20 is "large", check the -xtgls- command; you can also include dummies for the N entities (countries?).
      Dear Joro,

      Thanks for the answer. Last night I was looking at this option. But, tell me, would the LSDVC method suffice as well?



      • #4
        Edib:
        with T>N panel datasets the main issue rests on serial correlation and, possibly, correlation across panels as well.
        As far as I know, the LSDVC is usually an option for dynamic panel datasets (as per your description, I assume you're dealing with a static one).
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          Dear Carlo,
          Thanks for the answer. As I understood from you, the LSDVC is not a suitable option. Hence, what would you suggest? Would the -xtgls- command, as suggested by Joro Kolev, be a better option?



          • #6
            Edib:
            yes, Joro's advice is on target here.
            I would also consider -xtregar-.
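            A minimal sketch, assuming the panel is declared with hypothetical country and year identifiers and y, x1, x2 stand in for your variables:

            Code:
            xtset country year
            xtregar y x1 x2, fe

            -xtregar, fe- fits the fixed-effects model with an AR(1) disturbance, which addresses the serial correlation mentioned above.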
            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              Dear Carlo and Joro,
              Thanks a lot for your advice. I will explore these options. Appreciate it.
              Regards,



              • #8
                Dear Carlo and Joro,
                One more question on this. I want to investigate an interaction term (in this case between FDI and institutional development), so is it OK to include it under the two suggested options, i.e. the -xtregar- and -xtgls- commands?



                • #9
                  Edib:
                  yes, it is.
                  See -fvvarlist- notation on how to code interactions (and categorical variables) efficiently.
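                  For example, a minimal sketch under the -xtgls- specification suggested above (fdi and inst are hypothetical names for your FDI and institutional-development variables):

                  Code:
                  xtgls y c.fdi##c.inst i.country, panels(heteroskedastic) corr(ar1)

                  c.fdi##c.inst enters fdi, inst, and their interaction in one step.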
                  Kind regards,
                  Carlo
                  (Stata 19.0)



                  • #10
                    Once again, thank you so much, Carlo!



                    • #11
                      Dear All,
                      I have seen some papers that use FE and RE and report the R-squared and adjusted R-squared, as well as the F-statistic and its p-value, for both models. As far as I know, the R-squared is not reported in either of those. The adjusted R-squared, as well as the F-statistic and its p-value, are reported under FE but not under RE.

                      Now I am confused: how do they report this information if it is not available? Or am I missing something?

                      Regards,



                      • #12
                        Edib:
                        - both -xtreg, fe- and -xtreg, re- report different R-squareds: that said, under the -fe- (-re-) specification the within R-sq (between R-sq) is what to look at;
                        - the adjusted R-sq is available from -xtreg, fe- only: you can access it by typing:
                        Code:
                        . display e(r2_a)
                        after -xtreg,fe-.

                        Hence, I do not think that you missed out on anything.
                        That said, this link can be interesting: https://www.stata.com/support/faqs/s...distributions/
                        Kind regards,
                        Carlo
                        (Stata 19.0)



                        • #13
                          Thank you very much, Carlo! This is excellent. Exactly what I was looking for. I appreciate it.
                          Regards



                          • #14
                            Edib: I'll add a few things. First, T = 20 is not "large T." Just because T > N doesn't mean methods that assume large T are appropriate. In fact, you have small N and small T. It's true that "large T" methods are more appropriate than "large N" methods, but you must exercise caution. Almost all panel data methods are justified by asymptotic approximations, and they are likely to be poor in your case if they are based on large T and absolutely terrible if based on large N.

                            I would avoid the GLS methods, as those are justified only when one can trust asymptotics. Plus, it is not obvious you can include country and year fixed effects.

                            I would stick with two-way fixed effects. You cannot cluster with N = 5, but you can try the Driscoll-Kraay standard errors, which are based on large T. You need to install the user-written command -xtscc-. I've used it with T around 25, and that's probably pushing it. In your case, I would use one lag, maybe two, in the Newey-West truncation.

                            Generically, you have to create the time dummies and add them. I don't believe factor notation is supported.

                            Code:
                            xtset id
                            xtscc y x1 ... xK d2 ... d20, fe lag(1)
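                            As a concrete (hypothetical) sketch, with id and year as the panel and time identifiers and y, x1, x2 standing in for the variables, the time dummies can be created with -tabulate-:

                            Code:
                            tabulate year, generate(d)        // creates indicators d1-d20, one per year
                            xtset id year
                            xtscc y x1 x2 d2-d20, fe lag(1)   // two-way FE with Driscoll-Kraay SEs, one lag

                            d1 is left out as the base year.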



                            • #15
                              Originally posted by Jeff Wooldridge View Post
                              Edib: I'll add a few things. First, T = 20 is not "large T." Just because T > N doesn't mean methods that assume large T are appropriate. In fact, you have small N and small T. It's true that "large T" methods are more appropriate than "large N" methods, but you must exercise caution. Almost all panel data methods are justified by asymptotic approximations, and they are likely to be poor in your case if they are based on large T and absolutely terrible if based on large N.

                              I would avoid the GLS methods, as those are justified only when one can trust asymptotics. Plus, it is not obvious you can include country and year fixed effects.

                              I would stick with two-way fixed effects. You cannot cluster with N = 5, but you can try the Driscoll-Kraay standard errors, which are based on large T. You need to install the user-written command -xtscc-. I've used it with T around 25, and that's probably pushing it. In your case, I would use one lag, maybe two, in the Newey-West truncation.

                              Generically, you have to create the time dummies and add them. I don't believe factor notation is supported.

                              Code:
                              xtset id
                              xtscc y x1 ... xK d2 ... d20, fe lag(1)
                              Dear Prof. Wooldridge,
                              Thank you very much for your kind response. I tried the previously suggested options, but I will explore this one as well. I would appreciate it if you could direct me to articles that use this method so that I can learn about it and explain it. Perhaps you can share that paper of yours with me; I can also share my email with you if you are OK with that.

                              Finally, I am not sure if I understood you correctly: can I use the interaction term or not? If I understood you correctly, I cannot use factor notation, but can I use the interaction term if I create it before estimation using the -gen- command? I am interested in seeing the effect of FDI on economic growth first, and then in how institutions (interacting with FDI) affect this FDI-growth nexus.
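                              For concreteness, here is a minimal sketch of what I have in mind (fdi, inst, and the other names are hypothetical, and the time dummies follow your example):

                              Code:
                              generate fdi_inst = fdi * inst
                              xtscc y fdi inst fdi_inst d2-d20, fe lag(1)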

                              Looking forward to hearing from you soon.

                              Regards,

                              Edib Smolo

