Pooled OLS panel data cluster option

Ali Malik

Join Date: Jul 2018

Posts: 23
#1

01 Aug 2018, 15:31

Hi,

I am trying to understand when it makes sense to use the cluster option when running a pooled OLS regression on panel data.

My dataset is a panel of more than 100 firms observed over a number of years. I want to replicate the methodology of a paper that uses pooled OLS to estimate a univariate prediction model for a within-sample prediction analysis.

I understand that the cluster option takes into account the correlation between observations within a panel ID, and that without it the panel structure of the data is ignored. However, does it make sense to use this option when the purpose is to estimate a univariate prediction model?

In other words, would it make sense to use:
- reg y x
or
- reg y x, cluster(Panel_ID)
keeping in mind that I am dealing with panel data?

Many thanks, Ali
Tags: Cluster option, panel data, Pooled OLS, regression
Clyde Schechter

Join Date: Apr 2014

Posts: 30065
#2

01 Aug 2018, 15:48

You should use the -cluster()- option unless you have good reason to believe that the error terms are independent within the panels, or the number of panels is too small for cluster-robust standard errors to be valid. There is no consensus on how small a number of panels is too small. You refer to 100 "firms" in your data, without saying whether the firms are the panels. But if you have 100 panels, I think everyone would agree that is sufficiently many for cluster-robust standard errors to be valid.
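
To make the comparison concrete, here is a minimal sketch of the two specifications. It assumes Stata's example dataset nlswork (loaded with -webuse-) as a stand-in for the firm panel, with idcode playing the role of Panel_ID; the variables ln_wage and tenure are illustrative choices, not the ones in the original question.

```stata
* Illustrative only: nlswork is a Stata example dataset, not the poster's data
webuse nlswork, clear

* Pooled OLS with conventional standard errors (ignores the panel structure)
regress ln_wage tenure

* Same model with standard errors clustered on the panel identifier
regress ln_wage tenure, vce(cluster idcode)
```

The two commands report identical coefficients, so fitted values from -predict- are the same either way. Only the standard errors, and therefore the confidence intervals and tests, differ. -vce(cluster idcode)- is the currently documented syntax; the older -cluster(idcode)- form is still accepted.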