Hi,
I am trying to understand when it makes sense to use the cluster option when running a pooled OLS regression on panel data.
My dataset is a panel of more than 100 firms observed over a number of years. I want to replicate the methodology of a paper that uses pooled OLS to estimate a univariate prediction model for a within-sample prediction analysis.
I understand that the cluster option accounts for correlation between observations within the same panel ID; without it, the panel structure of the data is ignored. However, does it make sense to use this option when the goal is to estimate a univariate prediction model?
In other words, would it make sense to use:
- reg y x
or
- reg y x, cluster(Panel_ID)
keeping in mind that I am dealing with panel data?
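To make the comparison concrete, here is a minimal sketch of the two specifications. It uses Stata's Grunfeld example panel as a stand-in for my data, which is an assumption for illustration only: invest, mvalue and company play the roles of y, x and Panel_ID. As far as I can tell, clustering changes only the estimated standard errors and not the coefficients, so the fitted values should be the same under both commands; the last line checks this.

```stata
* Stand-in data (assumption): Grunfeld investment panel, 10 firms over 20 years
webuse grunfeld, clear

* Pooled OLS with default standard errors
regress invest mvalue
predict double yhat_ols, xb

* Same model, standard errors clustered on the panel identifier
regress invest mvalue, vce(cluster company)
predict double yhat_cl, xb

* Stops with an error if any fitted value differs beyond rounding
assert reldif(yhat_ols, yhat_cl) < 1e-10
```

If the assert passes, the choice between the two commands matters for inference on the coefficient (standard errors, t statistics, confidence intervals) rather than for the within-sample predictions themselves.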
Many thanks, Ali