Manually replicating standard errors obtained with reg [aw]

Stefano Lombardi

Join Date: Aug 2019

Posts: 17
#1

23 Aug 2024, 00:06

I would like to replicate the coefficients and standard errors obtained from the regression reg y x [aw=inv_w], following the explanation given here: https://www.stata.com/support/faqs/s...ar-regression/

I attach a very simple minimal example, where I first calculate weights to balance a male covariate across two samples, and then I run a weighted regression using analytical weights.

I followed the first part of the explanation and managed to obtain the same OLS coefficients, but I am not sure how to get the same standard errors (I do not understand what n_bar is in (1), or how exactly to use it to obtain identical standard errors).

Any help is very much appreciated,
SL
Attached Files

mwe_data.dta (14.3 KB, 1 view)

STATA_mwe.do (788 Bytes, 1 view)
Tags: regression
George Ford

Join Date: Aug 2014

Posts: 3138
#2

23 Aug 2024, 10:21

The code in STATA_mwe produces identical results across the two approaches. Is there something missing?
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17702
#3

23 Aug 2024, 11:34

Stefano:
as an aside to George's helpful reply, skimming through the same source you quoted, n_bar is the reciprocal of N/Σ_k n_k, that is, n_bar = (Σ_k n_k)/N, if this is the core of your question.

Kind regards,
Carlo
(Stata 19.0)
Stefano Lombardi

Join Date: Aug 2019

Posts: 17
#4

23 Aug 2024, 11:44

They are indeed very similar but not exactly identical. Could this be a matter of numerical precision? (Stata is unlikely to estimate reg y x [aw] by explicitly multiplying all terms, as done in the example.)
Stefano Lombardi

Join Date: Aug 2019

Posts: 17
#5

23 Aug 2024, 11:45

Originally posted by Carlo Lazzaro

Stefano:
as an aside to George's helpful reply, skimming through the same source you quoted, n_bar is the reciprocal of N/Σ_k n_k, that is, n_bar = (Σ_k n_k)/N, if this is the core of your question.

Thanks, do you understand it as the sum of the weights in the whole sample? I don't see k defined anywhere in the document.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17702
#6

23 Aug 2024, 12:10

Stefano:
a better explanation seems to be given in FAQ: summarize and aweights and pweights | Stata

Kind regards,
Carlo
(Stata 19.0)
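On the n_bar question raised in #5, a short worked reading may help. It takes the description in #3 literally and assumes that k simply indexes the N observations in the estimation sample (the thread does not confirm this). Under that reading n_bar is the mean of the weights, not their sum, and dividing each weight by it yields weights that sum to N, which is the normalization Stata documents for aweights. The symbol w_j for the normalized weight is introduced here only for illustration.

```latex
\bar{n} = \frac{1}{N}\sum_{k=1}^{N} n_k ,
\qquad
w_j = \frac{n_j}{\bar{n}} = \frac{N\, n_j}{\sum_{k=1}^{N} n_k} ,
\qquad
\sum_{j=1}^{N} w_j = N .
```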
George Ford

Join Date: Aug 2014

Posts: 3138
#7

23 Aug 2024, 13:36

It is a precision issue.

When you create the weighted series, add double after gen:

Code:

gen double `x'_w = `x'*sqrt(w_inv)

Now they are exactly the same to the decimals shown.
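A minimal sketch of why the storage type matters, on a one-observation toy dataset rather than the attached data. The names x_f and x_d are made up for illustration. Without an explicit type, gen stores the result as a float (unless set type has been changed), which keeps roughly 7 significant digits, while double keeps roughly 16.

```stata
* Toy illustration; x_f and x_d are hypothetical names
clear
set obs 1
gen        x_f = sqrt(2)     // default storage type is float
gen double x_d = sqrt(2)     // stored in double precision

* Show both stored values and the rounding gap between them
display %20.16f x_f[1]
display %20.16f x_d[1]
display %20.16f x_d[1] - x_f[1]
```

A gap of this kind in each transformed variable is enough to move the manual regression's results in the later decimals.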
Stefano Lombardi

Join Date: Aug 2019

Posts: 17
#8

24 Aug 2024, 02:08

Originally posted by George Ford

It is a precision issue.

When you create the weighted series, add double after gen:

Code:

gen double `x'_w = `x'*sqrt(w_inv)

Now they are exactly the same to the decimals shown.

That is correct, thanks!

Also, the same weighted regression (multiplying all terms by the square root of the weight) works for fw as well, as expected.
Both approaches lead to the same OLS coefficient vector. The difference between "fw" and "aw" lies in which "N" enters the residual degrees of freedom N - k (k being the number of regressors) that divide the SSR: with "aw" it is just the original sample size, with "fw" it is the sum of the weights (which must be integers in this case). In the "fw" case, N equals the number of rows we would have if we expanded the dataset according to the weights.
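The comparison above can be sketched on simulated data. This is an illustration under stated assumptions, not the contents of STATA_mwe.do: the variables x, y, w, sw, x_w and y_w and the scalar names are all made up, and the weights are drawn as integers so that fw is allowed.

```stata
* Illustrative sketch on simulated data; all names are hypothetical
clear
set seed 12345
set obs 200
gen double x = rnormal()
gen int    w = ceil(5*runiform())      // integer weights, values 1 to 5
gen double y = 1 + 2*x + rnormal()

* Reference fits with Stata's own weighting
regress y x [aw = w]
scalar se_aw = _se[x]
regress y x [fw = w]
scalar se_fw = _se[x]

* Manual fit: scale every term, including the constant, by sqrt(w)
gen double sw  = sqrt(w)
gen double y_w = y*sw
gen double x_w = x*sw
regress y_w x_w sw, noconstant
scalar se_man = _se[x_w]
scalar df_aw  = e(df_r)                // N - k, with N = number of rows
scalar k      = e(N) - e(df_r)         // number of estimated coefficients

* fw uses the sum of the weights in place of N
quietly summarize w
scalar df_fw = r(sum) - k

display "aw: " se_aw "   manual: " se_man
display "fw: " se_fw "   manual, rescaled: " se_man*sqrt(df_aw/df_fw)
```

The rescaling in the last line only swaps the residual degrees of freedom, since the coefficient vector and the weighted SSR are the same in both fits.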