Heteroskedasticity

Bekhruz Nuraliev

Join Date: Dec 2023

Posts: 2
#1

09 Dec 2023, 04:42

Dear all,
I am trying to eliminate the problem of heteroskedasticity in my regression model. I have tried a log transformation, WLS, and FGLS, but none of them helped, so I used robust standard errors instead. However, the standard errors I got from this method were almost the same as the original standard errors. Can I assume that the robust standard error method eliminated heteroskedasticity in my model?

. reg an_spending income age gender edu pur_freq gender_edu

      Source |       SS       df       MS              Number of obs =     978
-------------+------------------------------           F(  6,   971) = 2059.19
       Model |  2.4233e+10     6  4.0388e+09           Prob > F      =  0.0000
    Residual |  1.9045e+09   971  1961353.99           R-squared     =  0.9271
-------------+------------------------------           Adj R-squared =  0.9267
       Total |  2.6137e+10   977  26752609.4           Root MSE      =  1400.5

------------------------------------------------------------------------------
 an_spending |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      income |   .0247537     .00194    12.76   0.000     .0209467    .0285608
         age |   81.73533   3.261323    25.06   0.000     75.33528    88.13538
      gender |  -1273.963    123.972   -10.28   0.000    -1517.247   -1030.679
         edu |   2267.413   127.8328    17.74   0.000     2016.553    2518.273
    pur_freq |   16919.65   160.1966   105.62   0.000     16605.27    17234.02
  gender_edu |   389.4784   179.4746     2.17   0.030     37.27553    741.6812
       _cons |   -5280.67    220.816   -23.91   0.000    -5714.001   -4847.338
------------------------------------------------------------------------------

. reg an_spending income age gender edu pur_freq gender_edu, robust

Linear regression                                      Number of obs =     978
                                                       F(  6,   971) = 1731.81
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.9271
                                                       Root MSE      =  1400.5

------------------------------------------------------------------------------
             |               Robust
 an_spending |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      income |   .0247537   .0020335    12.17   0.000     .0207632    .0287442
         age |   81.73533   3.534058    23.13   0.000     74.80006     88.6706
      gender |  -1273.963   127.4425   -10.00   0.000    -1524.057   -1023.868
         edu |   2267.413   125.6639    18.04   0.000     2020.809    2514.017
    pur_freq |   16919.65   194.0594    87.19   0.000     16538.82    17300.47
  gender_edu |   389.4784   179.1607     2.17   0.030     37.89161    741.0651
       _cons |   -5280.67   222.5665   -23.73   0.000    -5717.437   -4843.903
------------------------------------------------------------------------------
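
To compare the two sets of standard errors side by side rather than across two separate outputs, something along these lines may help. It is only a sketch: it assumes the same dataset is in memory, and the stored-estimate names ols_conv and ols_rob are arbitrary labels that do not come from the original post.

```stata
* Fit the same model twice and keep both sets of results
quietly regress an_spending income age gender edu pur_freq gender_edu
estimates store ols_conv      // conventional (homoskedastic) standard errors

quietly regress an_spending income age gender edu pur_freq gender_edu, vce(robust)
estimates store ols_rob       // heteroskedasticity-robust standard errors

* Coefficients are identical by construction; only the standard errors differ
estimates table ols_conv ols_rob, b(%10.4f) se(%10.4f)
```

Robust standard errors change how the standard errors are computed, not the errors themselves, so close agreement between the two columns does not show that heteroskedasticity has been removed from the model.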
Nick Cox

Join Date: Mar 2014

Posts: 35754
#2

09 Dec 2023, 05:19

I don't think you can ever eliminate heteroscedasticity unless you manage to do that by transforming the data. What you can do is choose a model and/or estimation procedure suitable for your data.

The similarity of robust and conventional standard errors is no doubt comforting, but it's impossible to tell from these results whether you have done as much as you can in working towards a suitable model. For one, whether a linear functional form y = Xb is a good idea needs some independent checks. I've often found that added variable plots are helpful.
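
The graphical checks mentioned here could be run roughly as follows. This is a minimal sketch, assuming the original poster's variables are in memory; the residual-versus-fitted plot is an extra check that is not named in the post.

```stata
* Refit the model quietly, then inspect it graphically
quietly regress an_spending income age gender edu pur_freq gender_edu

* Added-variable plots, one panel per predictor
avplots

* Residuals against fitted values: curvature points to functional form,
* a fan shape points to non-constant variance
rvfplot, yline(0)
```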
Richard Williams

Join Date: Apr 2014

Posts: 5011
#3

09 Dec 2023, 06:46

Here is my handout on hetero:

https://www3.nd.edu/~rwilliam/stats2/l25.pdf

As Nick suggests, what tests suggest is hetero may actually be a problem with model specification. Variables may need to be transformed, or variables (e.g. interaction terms) may need to be added. Using robust standard errors may seem like a nice quick solution, but it isn’t necessarily the correct one.

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://academicweb.nd.edu/~rwilliam/
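
As a rough illustration of the point that a heteroskedasticity test may really be flagging misspecification, the standard postestimation checks can be run together. This is a sketch that assumes the original poster's variables; the particular choice of tests is an assumption, not taken from the handout.

```stata
* Run after the conventional fit (no robust option), since
* estat hettest is not available after robust estimation
quietly regress an_spending income age gender edu pur_freq gender_edu

estat hettest          // Breusch-Pagan / Cook-Weisberg test
estat imtest, white    // White's general test
estat ovtest           // Ramsey RESET, a check for omitted nonlinear terms
linktest               // specification link test
```

If the RESET or link test also rejects, the apparent heteroskedasticity may partly reflect functional form, and it is worth revisiting the specification before settling on robust standard errors.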
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17724
#4

09 Dec 2023, 10:28

Bekhruz:
welcome to this forum.
As an aside to previous excellent advice:
1) have you already checked what happens if you add -age- squared (by the way, why create the interaction by hand when you can rely on -fvvarlist- notation?):

Code:

reg an_spending income c.age##c.age gender edu pur_freq i.gender##i.edu

;
2) with a sample size of 978, I would check whether -vce(cluster idcode)- is the way to go. If you detect both heteroskedasticity and autocorrelation of the epsilon, -vce(cluster idcode)- rules (see https://www.stata.com/bookstore/envi...s-using-stata/ pages 28-30);
3) your sky-high R2 casts some further doubt on the correct specification of your regression model.

Kind regards,
Carlo
(Stata 19.0)
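
Points 1) and 2) could be combined along these lines. This is only a sketch under stated assumptions: idcode is a placeholder for whatever variable identifies the clusters (the original post does not show one), and gender and edu are assumed to be categorical so that factor-variable notation applies.

```stata
* idcode is hypothetical: replace it with the actual cluster identifier, if any
* i.gender##i.edu expands to both main effects plus their interaction
regress an_spending income c.age##c.age pur_freq i.gender##i.edu, vce(cluster idcode)

* Wald test of the squared age term
testparm c.age#c.age
```

Clustering only makes sense if the data contain repeated observations per unit; with one observation per respondent, vce(robust) is the relevant option.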
Bekhruz Nuraliev

Join Date: Dec 2023

Posts: 2
#5

09 Dec 2023, 15:29

Thank you very much. I will check the specification of my model.