  • Pooled OLS panel data cluster option

    Hi,

    I am trying to understand when it makes sense to use the cluster option when running a pooled OLS regression on panel data.

    My dataset consists of panel data containing observations of >100 firms over a number of years. I want to replicate the methodology used in a paper where pooled OLS is used to estimate a univariate prediction model for a within-sample prediction analysis.

    I understand that using the cluster option helps take into account the correlation between observations within a panel ID, and that without it the panel structure of the data is ignored. However, does it make sense to use this option for the purpose of estimating a univariate prediction model?

    In other words, would it make sense to use:
    - reg y x
    or
    - reg y x, cluster(Panel_ID)
    keeping in mind that I am dealing with panel data?

    Many thanks, Ali

  • #2
    You should use the -cluster()- option unless you have good reason to believe that the error terms are independent within the panels, or the number of panels is too small for cluster-robust standard errors to be reliable. There is no consensus on how small a number of panels is too small. You refer to more than 100 "firms" in your data, without saying whether the firms are the panels or not. But if you have 100 or more panels, I think everyone would agree that is sufficiently many for the validity of cluster-robust standard errors.
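
    For what it is worth, the -cluster()- option changes only the standard errors, not the coefficient estimates, so the predictions from the two commands are identical. Here is a minimal sketch on simulated data, assuming a made-up panel of 100 firms observed over 10 years; the variables u and year, the seed, and the true coefficients are all invented for illustration and have nothing to do with your data:

    ```stata
    * Simulated panel: 100 firms x 10 years (hypothetical data, for illustration only)
    clear
    set seed 12345
    set obs 100
    gen Panel_ID = _n
    gen u = rnormal()                  // firm-level shock shared by all years of a firm
    expand 10                          // 10 observations per firm
    bysort Panel_ID: gen year = _n
    gen x = rnormal() + u              // regressor correlated within firm
    gen y = 1 + 0.5*x + u + rnormal()  // errors correlated within firm through u

    * Pooled OLS with conventional standard errors
    reg y x
    estimates store pooled

    * Same regression with cluster-robust standard errors
    reg y x, cluster(Panel_ID)
    estimates store clustered

    * Side-by-side: coefficients match, standard errors differ
    estimates table pooled clustered, b(%9.4f) se(%9.4f)
    ```

    The coefficient rows of the final table are the same in both columns; only the standard errors differ. So the choice matters for inference about the coefficient on x, not for the fitted values themselves.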
