Dear Statalist users,
I have several questions regarding the best approach to estimate the following model. I am estimating the effect of temperature on the amount of loans provided to enterprises. Data are aggregated at the municipal level in a single country (there is no bilateral enterprise-bank relationship in my dataset). Temperature data are split into a trend (slow-moving) and a deviation from trend (iid), obtained in an earlier step by decomposing the observed municipal temperature series with a local level model in state-space form. I also add control variables at the national level to capture the overall determinants of credit. Credit is trending upwards and its properties resemble those of GDP (fairly smooth, but with some inter-annual variation).
The dimensions of the dataset are: N=7040, T=35.
The abridged model is therefore:
Code:
Credit_it = L1.Credit_it Temperature_trend_it Temperature_deviation_it L1.GDPnational_t
I have already checked that the model requires FE (and I consider only id FE). I am, however, a bit confused as to the best estimator and variance estimator to use in my case.
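Written out in equation form, the model above would look roughly as follows. The municipality fixed effect \alpha_i, the error term \varepsilon_{it} and the coefficient symbols \rho, \beta_1, \beta_2 and \gamma are notation added here for illustration only; they do not appear in the abridged line above, and a linear specification is assumed.

```latex
\[
\mathit{Credit}_{it} = \alpha_i + \rho\,\mathit{Credit}_{i,t-1}
  + \beta_1\,\mathit{TempTrend}_{it}
  + \beta_2\,\mathit{TempDeviation}_{it}
  + \gamma\,\mathit{GDPnational}_{t-1}
  + \varepsilon_{it},
\qquad i = 1,\dots,N,\quad t = 1,\dots,T .
\]
```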
In a nutshell, my question is: what would be the best estimator and variance estimator for a panel dataset with N=7040 and T=35 that accounts for heteroscedasticity, serial correlation and potentially spatial dependence, where the model includes a lagged dependent variable and some regressors at a higher level of aggregation than the dependent variable?
I break the big question down into several parts:
- According to what I have read, some estimators and variance estimators are better suited than others depending on how large or small N and T are. What is the rule of thumb in that context? I am fairly confident that N=7040 is large, but what about T=35?
- Accordingly, my reading seems to suggest that
Code:
xtreg credit l.credit temptrend tempsdeviation l.GDPnational, fe vce(cluster municipality)
would be the best overall choice because it controls for heteroscedasticity and autocorrelation at the same time.
As a side note, I have checked the autocorrelation of the residuals in each municipality as follows:
  - Estimate the model with
Code:
xtreg credit l.credit temptrend tempsdeviation l.GDPnational, fe vce(cluster municipality)
  - Generate the residuals with
Code:
predict residuals, e
  - For each municipality (panel id), obtain the autocorrelation coefficient via
Code:
regress residuals L.residuals
  - Most of the residual series are autocorrelated, and the autocorrelation coefficient is more or less the same for all municipalities within the same province.
- However, having detected autocorrelation, I now wonder whether I should take it into account at the estimation stage. That is, should I use xtregar or xtgee instead?
- Does the fact that I use regressors at the national level (merely for control) introduce some kind of cross-sectional correlation in my model? If so, what would be the solution?
- If I have cross-sectional dependence, do you suggest that I model it with a spatial model (at least in the error term) or that I apply Driscoll and Kraay standard errors via xtscc? The latter solution causes my standard errors to inflate substantially, but I am not sure what structure in the errors, if any, xtscc imposes or allows.
- Given the size of my dataset and the presence of a lagged dependent variable among the regressors, is the Nickell bias a potential problem? If so, should I use xtdpdbc? This method, however, requires serially uncorrelated errors, which I do not have.
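The residual check described in the side note above can be sketched as a single do-file fragment. This is only a sketch under some assumptions: the panel is declared with a time variable called year (a hypothetical name, not taken from my dataset description), municipality is a numeric identifier, and statsby is used to run the residual regression municipality by municipality.

```stata
* Sketch only: "year" is a hypothetical name for the time variable;
* the other variable names are the ones used in the commands above.
xtset municipality year

* Fixed-effects estimation with standard errors clustered by municipality
xtreg credit l.credit temptrend tempsdeviation l.GDPnational, fe vce(cluster municipality)

* Idiosyncratic residuals e_it (statistic e of predict after xtreg, fe)
predict double residuals, e

* AR(1) coefficient of the residuals, one regression per municipality.
* The clear option replaces the data in memory with one row per municipality.
statsby rho=_b[L.residuals], by(municipality) clear: regress residuals L.residuals

* Distribution of the municipality-level autocorrelation coefficients
summarize rho, detail
```

Because statsby with the clear option replaces the dataset in memory, it is safest to run this on a copy of the data, for example between preserve and restore.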
Best regards,
Olivier.
