Dear Forum Members,
I have been experimenting with several models aimed at tackling heteroscedasticity.
That said, I'm still in doubt about the best method.
In short, the best values (AIC/BIC/R2/SEs) were, in general, found in the robust model, which also seems to have the advantages of a simple command and objectivity (no need to decide how to estimate the weights).
My questions are:
Why were 10 observations excluded from the estimation in VWLS model 1?
Is it correct to apply the SD formula to the result of predict, stdf so as to get the "true" SD for the VWLS model?
Is it correct to curb heteroscedasticity by "selecting" different weights (WLS-2 model) until an "appropriate" hettest result is reached?
Which model should eventually be taken as "less biased", considering that the OLS model presented heteroscedasticity? Is this a case of "much ado about nothing"? I mean, shouldn't robust regression become the standard approach in such cases?
Thanks in advance.
Code:
. sysuse auto
(1978 Automobile Data)

. quiet regress mpg turn foreign
. estimates store OLS
. predict standerror, stdf
. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of mpg

         chi2(1)      =    13.12
         Prob > chi2  =   0.0003

. * according to the test above, there is heteroscedasticity
. * rvfplot
. * also, by visual analysis

. * now, robust estimations
. quiet regress mpg turn foreign, vce(robust)
. estimates store Robust

. * then wls regression
. * variable "turn" seems to be the "culprit",
. * I created tentative weights, until reaching an "appropriate" homoscedasticity
. gen invturn1 = (1/turn)^3
. quiet reg mpg turn foreign [weight=invturn1]
. estimates store WLS1
. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of mpg

         chi2(1)      =     8.90
         Prob > chi2  =   0.0028

. gen invturn2 = (1/turn)^6
. quiet reg mpg turn foreign [weight=invturn2]
. estimates store WLS2
. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of mpg

         chi2(1)      =     5.14
         Prob > chi2  =   0.0234

. gen invturn3 = (1/turn)^10
. quiet reg mpg turn foreign [weight=invturn3]
. estimates store WLS3
. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of mpg

         chi2(1)      =     1.78
         Prob > chi2  =   0.1825

. * now, the vwls regression
. * first option, taking all the predictors as categorical variables
. quiet vwls mpg turn foreign
. estimates store VWLS1

. * second option, taking them as continuous variables
. * we have one continuous (in fact, discrete, ranging 31-51) and one categorical variable...
. * we need to get the conditional standard deviation of depvar
. * I decided to use the SD formula on the standard error of the forecast
. * SD = SE*sqrt(n). Hence
. gen mysd = standerror*sqrt(74)
. quiet vwls mpg turn foreign, sd(mysd)
. estimates store VWLS2

. * now, the table with the coefficients
. estimates table OLS Robust WLS? VWLS?, b(%7.4f) se(%7.4f) stats(N r2 r2_a aic bic)

------------------------------------------------------------------------------------
    Variable |    OLS     Robust     WLS1      WLS2      WLS3      VWLS1     VWLS2
-------------+----------------------------------------------------------------------
        turn |  -1.0292   -1.0292   -1.0040   -0.9432   -0.8311   -0.8775   -1.0407
             |   0.1389    0.1454    0.1495    0.1616    0.1839    0.0920    1.2410
     foreign |  -1.2636   -1.2636   -0.6496    0.1936    1.4659   -0.4163   -1.3407
             |   1.3280    1.6963    1.2928    1.2567    1.2148    1.0868   11.7757
       _cons |  62.4796   62.4796   61.2938   58.6434   53.9279   55.8830   62.9635
             |   5.7843    6.2346    6.0423    6.2918    6.7699    3.9388   51.7039
-------------+----------------------------------------------------------------------
           N |       74        74        74        74        74        64        74
          r2 |   0.5233    0.5233    0.4645    0.3858    0.2729
        r2_a |   0.5099    0.5099    0.4494    0.3685    0.2524
         aic | 419.9633  419.9633  434.0915  444.3773  451.7722         .         .
         bic | 426.8755  426.8755  441.0037  451.2895  458.6844         .         .
------------------------------------------------------------------------------------
                                                                        legend: b/se

. * we see that VWLS doesn't provide R-squared. Besides, model 1,
. * which took all predictors, "cleaned up" 10 observations.
. * yet, the predictions for xb as well as se are available for 74 observations
. predict myprediction, xb
. predict myprediction2, stdp
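Regarding the 10 observations missing from VWLS model 1, a check along the following lines might show which cars were left out. This is only a sketch. It assumes that vwls, when sd() is not specified, estimates the standard deviation within each group of identical predictor values and so cannot use groups that contain a single observation. The variable names used and groupsize are made up for illustration.

```stata
* Sketch only: flag the observations vwls actually used and compare them
* with the size of each turn/foreign covariate pattern.
sysuse auto, clear
quietly vwls mpg turn foreign

* e(sample) marks the estimation sample of the most recent model
generate byte used = e(sample)

* number of cars sharing the same values of turn and foreign
* (groupsize is a hypothetical helper variable)
bysort turn foreign: generate groupsize = _N

* cross-tabulate group size against inclusion in the estimation sample
tabulate groupsize used

* list the excluded cars with their covariate pattern
list make turn foreign groupsize if !used
```

If every excluded car turns out to have groupsize equal to 1, that would be consistent with the assumption above, though I have not confirmed that this is the only reason for exclusion.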