Dear Stata users,
I am using Stata/MP 13.1 and I believe I have detected an issue with the command "xtreg, be" when applied to an unbalanced panel: I suspect that "xtreg, be" corresponds to WLS on the group-meaned data, and that "xtreg, be wls" corresponds to OLS on the group-meaned data.
Indeed, if I type these two commands on my dataset (1355 municipalities i over 3 years t, for a total of 3501 observations it), I get:
. xtreg y_it x_it, be
Between regression (regression on group means) Number of obs = 3501
Group variable: munibr Number of groups = 1355
R-sq: within = 0.0024 Obs per group: min = 1
between = 0.1772 avg = 2.6
overall = 0.0867 max = 3
F(1,1353) = 291.38
sd(u_i + avg(e_i.))= .0430597 Prob > F = 0.0000
------------------------------------------------------------------------------
y_it | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x_it | -.0335348 .0019646 -17.07 0.000 -.0373887 -.0296809
_cons | .4034226 .0100747 40.04 0.000 .383659 .4231863
------------------------------------------------------------------------------
. xtreg y_it x_it, be wls
Between regression (regression on group means) Number of obs = 3501
Group variable: munibr Number of groups = 1355
R-sq: within = 0.0024 Obs per group: min = 1
between = 0.1621 avg = 2.6
overall = 0.0867 max = 3
F(1,1353) = 261.75
sd(u_i + avg(e_i.))= .0414943 Prob > F = 0.0000
------------------------------------------------------------------------------
y_it | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x_it | -.0312599 .0019322 -16.18 0.000 -.0350503 -.0274696
_cons | .3897733 .0100087 38.94 0.000 .370139 .4094075
------------------------------------------------------------------------------
.
Now, if I run OLS on the group-meaned data computed manually, the coefficients match those of "xtreg, be wls" (the standard errors do not):
. by id, sort: egen double x_i=mean(x_it)
. by id, sort: egen double y_i=mean(y_it)
. reg y_i x_i
Source | SS df MS Number of obs = 3501
-------------+------------------------------ F( 1, 3499) = 676.92
Model | 1.16445331 1 1.16445331 Prob > F = 0.0000
Residual | 6.01905508 3499 .001720222 R-squared = 0.1621
-------------+------------------------------ Adj R-squared = 0.1619
Total | 7.18350839 3500 .002052431 Root MSE = .04148
------------------------------------------------------------------------------
y_i | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x_i | -.0312599 .0012015 -26.02 0.000 -.0336156 -.0289042
_cons | .3897733 .0062238 62.63 0.000 .3775707 .4019759
------------------------------------------------------------------------------
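One caveat about this check: the manual regression runs on all 3501 rows, so each group's mean appears once for every year that group is observed, which amounts to weighting each group by its number of observations. A cleaner comparison, sketched below, collapses the data to one row per group first. It assumes the panel identifier is the variable id used in the egen commands (the xtreg header shows munibr) and that y_it and x_it have no missing values; the names ybar, xbar and T_i are introduced here only for illustration.

```stata
* Sketch only: assumes the original panel data are in memory.
preserve
* One row per group: group means plus the number of observations per group.
collapse (mean) ybar=y_it xbar=x_it (count) T_i=y_it, by(id)
* Unweighted OLS on the group means (one observation per group).
regress ybar xbar
* Least squares on the group means, weighted by group size T_i.
regress ybar xbar [aweight=T_i]
restore
```

If the unweighted coefficients agree with "xtreg, be" and the weighted ones with "xtreg, be wls", that would suggest the two commands behave as documented and that the discrepancy comes from running the manual regression on uncollapsed data.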
Am I wrong, or is there really an issue with the "xtreg, be" command? Thank you in advance for your replies.
Best,
Geoffrey