VWLS, WLS, robust and OLS regressions - which one best tackles heteroscedasticity?

Marcos Almeida



04 Sep 2017, 04:38

Dear Forum Members,

I have been experimenting with several models aimed at tackling heteroscedasticity.

That said, I'm still in doubt about the best method.

Code:

 

. sysuse auto
(1978 Automobile Data)

. quiet regress mpg turn foreign

. estimates store OLS

. predict standerror, stdf

. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of mpg

         chi2(1)      =    13.12
         Prob > chi2  =   0.0003

. */ according to the test above, there is heteroscedasticity
. */ rvfplot
. */ also, by visual analysis
. */ now, robust estimations
. quiet regress mpg turn foreign, vce(robust)

. estimates store Robust

. */ then wls regression
. */ variable "turn" seems to be the "culprit",
. */ I tried tentative weights until reaching an "appropriate" homoscedasticity
. gen invturn1= (1/turn)^3

. quiet reg mpg turn foreign [weight=invturn1]

. estimates store WLS1

. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of mpg

         chi2(1)      =     8.90
         Prob > chi2  =   0.0028

. gen invturn2= (1/turn)^6

. quiet reg mpg turn foreign [weight=invturn2]

. estimates store WLS2

. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of mpg

         chi2(1)      =     5.14
         Prob > chi2  =   0.0234

. gen invturn3= (1/turn)^10

. quiet reg mpg turn foreign [weight=invturn3]

. estimates store WLS3

. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of mpg

         chi2(1)      =     1.78
         Prob > chi2  =   0.1825

. */ now, the vwls regression
. */ first option, taking all the predictors as categorical variables
. quiet vwls mpg turn foreign

. estimates store VWLS1

. */ second option, taking them as continuous variables
. */ we have one continuous (in fact, discrete, ranging 31-51) and one categorical variable...
. */ we need to get the conditional standard deviation of depvar
. */ I decided to use the SD formula on the standard error of the forecast
. */ SD = SE*sqrt(n). Hence
. gen mysd = standerror*sqrt(74)

. quiet vwls mpg turn foreign, sd( mysd)

. estimates store VWLS2

. */ now, the table with the coefficients
. estimates table OLS Robust WLS? VWLS?, b(%7.4f) se(%7.4f) stats(N r2 r2_a aic bic)

------------------------------------------------------------------------------------
    Variable |   OLS     Robust     WLS1      WLS2      WLS3      VWLS1     VWLS2  
-------------+----------------------------------------------------------------------
        turn | -1.0292   -1.0292   -1.0040   -0.9432   -0.8311   -0.8775   -1.0407  
             |  0.1389    0.1454    0.1495    0.1616    0.1839    0.0920    1.2410  
     foreign | -1.2636   -1.2636   -0.6496    0.1936    1.4659   -0.4163   -1.3407  
             |  1.3280    1.6963    1.2928    1.2567    1.2148    1.0868   11.7757  
       _cons | 62.4796   62.4796   61.2938   58.6434   53.9279   55.8830   62.9635  
             |  5.7843    6.2346    6.0423    6.2918    6.7699    3.9388   51.7039  
-------------+----------------------------------------------------------------------
           N |      74        74        74        74        74        64        74  
          r2 |  0.5233    0.5233    0.4645    0.3858    0.2729                      
        r2_a |  0.5099    0.5099    0.4494    0.3685    0.2524                      
         aic | 419.9633   419.9633   434.0915   444.3773   451.7722         .         .  
         bic | 426.8755   426.8755   441.0037   451.2895   458.6844         .         .  
------------------------------------------------------------------------------------
                                                                        legend: b/se

. */ we see that the VWLS models don't provide R-squared. Besides, model 1,
. */ which took all predictors as categorical, dropped 10 observations.
. */ yet, the predictions for xb as well as se are available for all 74 observations
. predict myprediction, xb

. predict myprediction2, stdp
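The trial-and-error search for weights shown in the log could be condensed into a loop. This is only a sketch: it assumes the same powers (3, 6, 10) tried above, it spells out aweight (which is what regress assumes when a bare [weight=] is given), and the variable name w_turn is made up for illustration.

```stata
sysuse auto, clear

* Try each candidate power of 1/turn as an analytic weight
foreach p of numlist 3 6 10 {
    capture drop w_turn
    generate double w_turn = (1/turn)^`p'

    * Weighted regression, then the Breusch-Pagan / Cook-Weisberg test
    quietly regress mpg turn foreign [aweight=w_turn]
    quietly estat hettest

    * estat hettest leaves the statistic and p-value in r(chi2) and r(p)
    display "power = `p'" _col(15) "chi2(1) = " %6.2f r(chi2) ///
        _col(35) "Prob > chi2 = " %6.4f r(p)
}
```

The loop only automates the search; it does not settle whether picking the weight by the test result is a sound procedure, which is what I ask below.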

In short, the best overall figures (AIC/BIC/R2/SEs) were found in the robust model, which, by the way, seems to have among its advantages the simplicity of the command and objectivity (no need to decide how to estimate weights).

My questions are:

Why were 10 observations excluded from the estimation of VWLS model 1?

Is it correct to use the SD formula together with predict, stdf to get the "true" SD for the VWLS model?

Is it correct to curb heteroscedasticity by "selecting" different weights (as in the WLS2 model) until an "appropriate" hettest result is reached?

Which model should ultimately be taken as the "least biased", given that the OLS model showed heteroscedasticity? Is this a case of "much ado about nothing"? I mean, shouldn't regression with robust standard errors, i.e. vce(robust), become the standard approach in such cases?
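Regarding the first question, my guess (an assumption on my part, not something I have confirmed) is that vwls without sd() estimates the conditional SD of mpg within each turn-foreign combination, so combinations that cannot supply an SD are left out. A rough way to inspect this follows; the variable names n_pattern and sd_pattern are made up for illustration.

```stata
sysuse auto, clear

* Number of observations sharing each turn-foreign combination
bysort turn foreign: generate n_pattern = _N

* Within-combination SD of mpg; missing when there is a single observation
by turn foreign: egen sd_pattern = sd(mpg)

* Observations whose combination cannot supply a usable SD estimate
count if n_pattern == 1 | sd_pattern == 0
list turn foreign mpg n_pattern if n_pattern == 1, noobs
```

If the count matches the number of dropped observations, that would support the guess, but I would appreciate confirmation.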

Thanks in advance.

Last edited by Marcos Almeida; 04 Sep 2017, 04:42.

Best regards,

Marcos
