I'm working with a small unbalanced panel dataset (N=24, T=30, Obs.=590). My goal is a typical one: to test hypothesized relationships between the Xs and Y. Under either an FE or an RE framework, my errors are serially correlated. As a result, I'm thinking about making the model dynamic by adding a lag of the dependent variable as a regressor. I'm aware that doing so creates an endogeneity issue (the lagged dependent variable is correlated with the error term once the unit effects are removed), leading to biased estimates. From what I've read, this bias diminishes as T increases.
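To make the bias concrete, here is a minimal simulation sketch. It is not based on my data: it assumes a balanced AR(1) panel with unit fixed effects, no other regressors, standard normal errors, and an illustrative true coefficient of 0.5. The function names are hypothetical. It estimates the lag coefficient with the within (FE) estimator and reports the average bias for a short and a longer panel.

```python
import numpy as np

def simulate_within_estimate(n_units, n_periods, rho, rng, burn_in=50):
    """Simulate y_it = alpha_i + rho * y_i,t-1 + e_it and return the
    within (fixed effects) estimate of rho from n_periods usable rows."""
    alpha = rng.normal(size=n_units)      # unit fixed effects (assumed N(0, 1))
    y = alpha / (1.0 - rho)               # start each unit at its long-run mean
    total = burn_in + n_periods + 1       # +1 so that n_periods lagged pairs exist
    panel = np.empty((n_units, total))
    for t in range(total):
        y = alpha + rho * y + rng.normal(size=n_units)
        panel[:, t] = y
    panel = panel[:, burn_in:]            # drop the burn-in periods

    y_lag = panel[:, :-1]
    y_cur = panel[:, 1:]
    # Within transformation: subtract each unit's own time mean.
    y_lag_d = y_lag - y_lag.mean(axis=1, keepdims=True)
    y_cur_d = y_cur - y_cur.mean(axis=1, keepdims=True)
    return (y_lag_d * y_cur_d).sum() / (y_lag_d ** 2).sum()

def mean_bias(n_units, n_periods, rho, n_reps=500, seed=0):
    """Average (estimate - true rho) over n_reps simulated panels."""
    rng = np.random.default_rng(seed)
    estimates = [
        simulate_within_estimate(n_units, n_periods, rho, rng)
        for _ in range(n_reps)
    ]
    return float(np.mean(estimates) - rho)

if __name__ == "__main__":
    for n_periods in (5, 30):
        bias = mean_bias(n_units=24, n_periods=n_periods, rho=0.5)
        print(f"T={n_periods:2d}: mean bias of within estimator = {bias:+.3f}")
```

Under these assumptions the bias at T=30 should be negative but much smaller in magnitude than at T=5, in line with the large-T approximation of roughly -(1 + rho)/(T - 1). Whether a bias of that size can be ignored in my application is exactly what Q1 asks.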
Q1) Is T=30 large enough to ignore the endogeneity bias issue? Or do I need to address it via instrumentation?
Q2) More generally, when are T and N considered "large" or "small"?
Q3) Can/should time fixed effects be used in a dynamic model?
Q4) How can I decide whether a single lag of the dependent variable is enough? My dataset may be too small to support multiple lags, but I'd still like to know.
Perhaps a dynamic model isn't the way to go. Alternatively, I could use first differencing or a time polynomial to alleviate nonstationarity issues.
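For concreteness, the two alternatives could be sketched as follows. This is only an illustration under assumptions: the panel sits in a long-format pandas DataFrame with a unique index, and the column names (`unit`, `year`, `y`) and function names are hypothetical. The trend is fitted separately for each unit, which is one choice among several.

```python
import numpy as np
import pandas as pd

def first_difference(df, unit_col, time_col, value_col):
    """Within-unit first difference of value_col.
    Note: in an unbalanced panel with gaps, diff() subtracts the previous
    available row, which is not necessarily the previous period."""
    ordered = df.sort_values([unit_col, time_col])
    return ordered.groupby(unit_col)[value_col].diff()

def detrend_polynomial(df, unit_col, time_col, value_col, degree=2):
    """Residuals from a unit-specific polynomial time trend of given degree."""
    out = pd.Series(np.nan, index=df.index, dtype=float)
    for _, group in df.groupby(unit_col):
        t = group[time_col].to_numpy(dtype=float)
        t = t - t.mean()                  # center time for numerical stability
        y = group[value_col].to_numpy(dtype=float)
        coefs = np.polyfit(t, y, degree)  # least-squares polynomial fit
        out.loc[group.index] = y - np.polyval(coefs, t)
    return out

if __name__ == "__main__":
    demo = pd.DataFrame({
        "unit": [1] * 5 + [2] * 5,
        "year": list(range(2000, 2005)) * 2,
        "y": [3.0 + 2.0 * t for t in range(5)]
             + [10.0 + 0.5 * t ** 2 for t in range(5)],
    })
    demo["dy"] = first_difference(demo, "unit", "year", "y")
    demo["y_detrended"] = detrend_polynomial(demo, "unit", "year", "y")
    print(demo)
```

First differencing costs one observation per unit and removes the unit effect along with any unit root, whereas detrending keeps all observations but only removes a deterministic trend.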
Q5) How do I know which combination of these modeling techniques is most appropriate (dynamic methods, first-differencing, detrending, inclusion/exclusion of year dummies)?
Q6) At least one message board I found suggested that stationarity isn't a major concern when using panel data. I can't imagine how this could be true, as I think nonstationarity would lead to spurious results. Am I correct or am I missing something?
I've been reading message boards, online lecture notes, and academic papers for days but can't find practical answers to these questions.
If you can address ANY of these questions, I would greatly appreciate it. When doing so, please bear in mind that I'm looking for practical approaches and don't have the ability to understand highly technical/theoretical papers. Thank you!