Dear all,
I have a question that has been touched upon in other topics but so far hasn't been answered to my full satisfaction.
I have two regression stages where I use estimated coefficients from the first stage as the dependent variable in the second. More precisely, as the first stage I estimate the following time-series regression separately for each of N countries:
Y_t = a_0 + b_1 X_{t,1} + b_2 X_{t,2} + ... + b_K X_{t,K} + u_t
where b_1 is the coefficient of (main) interest and X_{t,2}, ..., X_{t,K} are control variables. (To give some background: X_{t,1} is a proxy for bank risk while Y_t represents sovereign risk. I want to investigate the transmission of bank risk to sovereign risk.) This first stage is estimated with daily data, separately for every country (no panel dimension yet) and for every quarter in the time span. Each quarter yields around 60 observations, and the estimation is repeated quarter after quarter.
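To make the first stage concrete, here is a minimal sketch in plain Python (not Stata, and not my actual code) of the loop described above: one OLS regression per country-quarter cell, keeping b_1 and its standard error for later use. The names `invert`, `ols` and `first_stage` and the row layout are made up for illustration. The standard errors are the classical homoskedastic ones, which is an assumption that daily financial data may well violate.

```python
import math
from collections import defaultdict

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]

def ols(y, X):
    """OLS of y on the rows of X; an intercept is added here.
    Returns (coefficients, classical standard errors)."""
    Z = [[1.0] + list(row) for row in X]
    n, k = len(Z), len(Z[0])
    ztz = [[sum(Z[i][a] * Z[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    zty = [sum(Z[i][a] * y[i] for i in range(n)) for a in range(k)]
    inv = invert(ztz)
    beta = [sum(inv[a][b] * zty[b] for b in range(k)) for a in range(k)]
    resid = [y[i] - sum(Z[i][a] * beta[a] for a in range(k)) for i in range(n)]
    s2 = sum(e * e for e in resid) / (n - k)  # residual variance
    se = [math.sqrt(s2 * inv[a][a]) for a in range(k)]
    return beta, se

def first_stage(rows):
    """rows: iterable of (country, quarter, y, [x1, x2, ...]) daily observations.
    Returns {(country, quarter): (b1, se_b1)}."""
    cells = defaultdict(lambda: ([], []))
    for country, quarter, y, x in rows:
        cells[(country, quarter)][0].append(y)
        cells[(country, quarter)][1].append(x)
    out = {}
    for key, (ys, xs) in cells.items():
        beta, se = ols(ys, xs)
        out[key] = (beta[1], se[1])  # index 0 is the intercept, index 1 is b1
    return out
```

The dictionary returned by `first_stage` holds exactly the two ingredients the second stage needs: the estimate b_1 for each country-quarter and a measure of how noisy that estimate is.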
In a second stage, I collect the b1s which vary over quarters and across countries and use them as dependent variables in a panel fixed-effects estimation with a quarterly frequency. Basically, I want to investigate why the transmission from bank to sovereign risks is stronger/weaker over various quarters and across countries while controlling for macro measures such as GDP growth that only have a quarterly frequency. The model looks like this:
b_{n,t} = c_0 + d_1 Z_{n,t,1} + ... + d_L Z_{n,t,L} + Country_n + e_{n,t}
with Country_n being the country fixed effect. Keep in mind that the first stage is daily (the Xs are interest rates, stock prices and the like), while the second stage has a quarterly frequency.
As you probably know, using estimated values from a previous stage as the dependent variable in a second stage leads to biased standard errors, because the second stage ignores the estimation error from the first stage. I would like to ask about the most effective way to remedy this problem. Some suggestions have been proposed:
- This Stata blog entry (http://blog.stata.com/2014/12/08/usi...tion-problems/) uses GMM to estimate both stages simultaneously, thus avoiding the two-stage-error problem. However, I don't believe I can apply this method because (aside from not using a GMM model) I have different time frequencies (daily and quarterly) and different models (time series and panel).
- Some authors suggest switching to an FGLS estimator in the second stage that allows for heteroskedastic and cross-correlated error terms (xtgls with the panels(correlated) option). This sounds reasonable to me as a first step, because the errors of my second stage are definitely cross-correlated, as the Pesaran test statistic shows. But does it really fix the biased standard errors specific to the two-stage problem?
- Lewis and Linzer (2005) (http://www.sscnet.ucla.edu/polisci/f...ewisLinzer.pdf) argue for an FGLS estimator that uses the sampling variances of the first-stage estimates (their squared standard errors) to build a weight, which is then used to adjust the second-stage regression (basically a weighted least squares regression with country-time specific weights). The weights (adjusted for a panel dimension) are calculated as follows:
w_{i,t} = 1 / sqrt(se_DependVariable_{i,t}^2 + SigmaHat_{i,t})
where se_DependVariable_{i,t} is the first-stage standard error of the dependent variable (squared, so that both terms under the root are variances) and SigmaHat_{i,t} is an estimate of the variance of the second-stage error term that is not due to the sampling error of the dependent variable. This sounds reasonable to me, and I have seen some papers claiming that they use this adjustment. However, is there a Stata command that does this adjustment automatically, or do the weights have to be calculated step by step? If there is no command, may I post my code to show how I have done it? (I don't feel too confident when it comes to formulating and calculating matrices in Stata.)
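To show what I mean by calculating the weights step by step, here is a minimal sketch in plain Python (not Stata) of the procedure as I understand it from Lewis and Linzer: (1) run OLS of the first-stage estimates on the second-stage regressors, (2) back out an estimate of the second-stage error variance from the OLS residuals and the first-stage sampling variances, truncated at zero, and (3) rerun the regression as WLS with the weights from the formula above. The function names and the data layout are made up for illustration. The sketch uses a single common SigmaHat rather than a country-time specific one, and fixed effects would have to enter as country dummies in X (dropping one, since an intercept is added). Both are simplifying assumptions.

```python
import math

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]

def lewis_linzer_fgls(b, se, X):
    """b: first-stage estimates; se: their standard errors (assumed > 0);
    X: rows of second-stage regressors (an intercept is added here).
    Returns (coefficients, sigma2_hat, weights)."""
    Z = [[1.0] + list(row) for row in X]
    n, k = len(Z), len(Z[0])
    omega2 = [s * s for s in se]  # first-stage sampling variances

    def weighted_ls(v):
        # Solve (Z' V Z) beta = Z' V b with V = diag(v).
        a = [[sum(v[i] * Z[i][p] * Z[i][q] for i in range(n)) for q in range(k)] for p in range(k)]
        c = [sum(v[i] * Z[i][p] * b[i] for i in range(n)) for p in range(k)]
        inv = invert(a)
        return [sum(inv[p][q] * c[q] for q in range(k)) for p in range(k)], inv

    # Step 1: unweighted OLS and its residuals.
    beta_ols, ztz_inv = weighted_ls([1.0] * n)
    resid = [b[i] - sum(Z[i][p] * beta_ols[p] for p in range(k)) for i in range(n)]

    # Step 2: sigma2 = (RSS - sum(omega2) + tr((Z'Z)^-1 Z'GZ)) / (n - k), G = diag(omega2).
    zgz = [[sum(omega2[i] * Z[i][p] * Z[i][q] for i in range(n)) for q in range(k)] for p in range(k)]
    trace = sum(ztz_inv[p][q] * zgz[q][p] for p in range(k) for q in range(k))
    sigma2 = max(0.0, (sum(e * e for e in resid) - sum(omega2) + trace) / (n - k))

    # Step 3: weights w = 1 / sqrt(omega2 + sigma2); WLS uses w^2 as inverse variances.
    w = [1.0 / math.sqrt(omega2[i] + sigma2) for i in range(n)]
    beta, _ = weighted_ls([wi * wi for wi in w])
    return beta, sigma2, w
```

Note that w multiplies each observation, so the inverse-variance weight actually entering the weighted regression is w squared, i.e. 1 / (se^2 + SigmaHat). If I read Stata's convention correctly, that squared quantity is what an analytic weight would expect, but I would be glad to have this confirmed.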
So maybe to sum up my questions:
1) Do you know of a method that fixes the two-stage error problem most efficiently (irrespective of the suggestions I listed above)? If there is an established way to solve this problem (maybe with a ready-made Stata command), I would be most happy to know about it!
2) How do you judge the suggestions I posted? Do you think FGLS will suffice? And do you know of a Stata command that implements the Lewis and Linzer weights? If not, may I post my code to show how I did it?
Any help is greatly appreciated!