mean comparison with sample weights (pweight)

Daniel Graeber

Join Date: Jan 2015

Posts: 36
#1

14 Jan 2021, 02:25

Dear all,

I want to compare means across groups while applying sample weights (pweight). That works fine. As a robustness check, I now want to allow for unequal variances across groups, still using sample weights, but I did not find any command that supports this. Instead, I ran a regression on the two groups with a group indicator as the only regressor, applying analytic weights and robust standard errors.

Is my solution appropriate? Thank you very much.

Best

Daniel

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17700

14 Jan 2021, 04:02

Daniel:
you may want to consider something along the following lines (please note that -pweight-s are usually applied in survey statistics):

Code:

. use "C:\Program Files\Stata16\ado\base\a\auto.dta"

. g prob=22/74 if foreign==1
(52 missing values generated)

. replace prob=52/74 if foreign==0
(52 real changes made)

. tab prob

       prob |      Freq.     Percent        Cum.
------------+-----------------------------------
   .2972973 |         22       29.73       29.73
   .7027027 |         52       70.27      100.00
------------+-----------------------------------
      Total |         74      100.00

. regress price mpg [pw=1/prob]
(sum of wgt is 147.9999997686673)

Linear regression                               Number of obs     =         74
                                                F(1, 72)          =      14.72
                                                Prob > F          =     0.0003
                                                R-squared         =     0.2356
                                                Root MSE          =     2508.4

------------------------------------------------------------------------------
             |               Robust
       price |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         mpg |  -222.8268   58.08595    -3.84   0.000     -338.619   -107.0346
       _cons |   11197.55   1417.579     7.90   0.000     8371.659    14023.44
------------------------------------------------------------------------------

.

Kind regards,
Carlo
(Stata 19.0)


Joro Kolev

Join Date: Aug 2018

Posts: 3050
#3

14 Jan 2021, 04:18

Yes, you can use -regress- to do this, however I do not see why you should be switching from pw to aw, just keep on using pw and add the robust option.
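
A minimal sketch of that suggestion, applied to a two-group mean comparison. The auto data has no sampling weights, so the variable wt below is a made-up inverse-probability weight (mirroring 1/prob from the example above) and is purely an assumption for illustration:

```stata
* Sketch only: -wt- is a hypothetical weight, not part of auto.dta
sysuse auto, clear
generate double wt = cond(foreign == 1, 74/22, 74/52)

* Weighted group means, for reference
mean price [pweight = wt], over(foreign)

* The coefficient on 1.foreign is the weighted difference in means.
* pweights imply vce(robust), so its standard error does not rely on
* equal variances across the two groups.
regress price i.foreign [pweight = wt]
```

The t test on 1.foreign then plays the role of the unequal-variance mean comparison.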

Or you can declare your data as survey data, as explained here:

https://stats.idre.ucla.edu/stata/fa...h-survey-data/
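
A sketch of the survey-data route, again with a hypothetical weight variable wt and no strata or clusters (both are assumptions; a real survey would supply its own design variables):

```stata
* Sketch only: -wt- is a hypothetical weight, not part of auto.dta
sysuse auto, clear
generate double wt = cond(foreign == 1, 74/22, 74/52)

* Declare each observation as its own sampling unit, with pweights
svyset _n [pweight = wt]

* Weighted difference in means with linearized standard errors
svy: regress price i.foreign
```

The point estimate should match -regress- with pweights; the standard errors may differ slightly because -svy- uses its own degrees-of-freedom adjustment.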
Daniel Graeber

Join Date: Jan 2015

Posts: 36
#4

14 Jan 2021, 06:08

Thanks, Carlo and Joro. You convinced me; I will switch to probability weights. On a side note: I already tried both, and the results only change in the third or fourth digit. But I assume probability weights are closer to the truth.

Best

Daniel

Daniel Graeber

Join Date: Jan 2015
Posts: 36

14 Jan 2021, 06:11

Interesting... Carlo, in your example, pweight and (aweight & robust) deliver the same results. Fascinating.


daniel klein

Join Date: Mar 2014

Posts: 3842
#6

14 Jan 2021, 06:24

Originally posted by Daniel Graeber:

Interesting,...... In your example, pweight and (aweight & robust) deliver the same results. Fascinating.

Well, pweights and aweights generally result in the same point estimates*; robust standard errors are based on the score/predicted values, which are based on those point estimates; using pweight implies vce(robust).

Edit: * This is not really true, e.g., estimating totals; see [U] 20.24 Weighted estimation
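
A quick way to check this for -regress-, sketched with a hypothetical weight variable wt (an assumption; auto.dta ships without weights). The scalar names are illustrative only:

```stata
* Sketch only: -wt- is a hypothetical weight, not part of auto.dta
sysuse auto, clear
generate double wt = cond(foreign == 1, 74/22, 74/52)

* pweights: robust standard errors are implied
regress price mpg [pweight = wt]
scalar b_pw  = _b[mpg]
scalar se_pw = _se[mpg]

* aweights with robust standard errors requested explicitly
regress price mpg [aweight = wt], vce(robust)
scalar b_aw  = _b[mpg]
scalar se_aw = _se[mpg]

* Relative differences; both should be zero up to rounding
display reldif(b_pw, b_aw)
display reldif(se_pw, se_aw)
```

The robust variance formula does not change when all weights are multiplied by a constant, which is why the internal rescaling of aweights makes no difference here.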

Last edited by daniel klein; 14 Jan 2021, 06:31.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17700
#7

14 Jan 2021, 06:27

Thanks, Daniel:
I was late to the party!

Kind regards,
Carlo
(Stata 19.0)
Daniel Graeber

Join Date: Jan 2015

Posts: 36
#8

14 Jan 2021, 06:54

Originally posted by daniel klein:

Well, pweights and aweights generally result in the same point estimates*; robust standard errors are based on the score/predicted values, which are based on those point estimates; using pweight implies vce(robust).

Edit: * This is not really true, e.g., estimating totals; see [U] 20.24 Weighted estimation

Thanks, I was referring to the standard errors ;-).
