Rescaling -pweights- doesn't alter coefficients?

Hello,

I am using survey data from Statistics Canada (N = 25113), which come with p-weights. According to the user documentation, if regression is used on a subset of the data, the p-weights must be rescaled so that their mean is 1. In other words, new_weight = wts_m / [mean of wts_m]. I was curious what effect this has on the overall regression model, so I compared one simple regression model using the original weights (wts_m) against the same model using the recalculated weights (pw_c).

Code:

. regress distress dhhgage [pw=wts_m]
(sum of wgt is   2.8121e+07)

Linear regression                                      Number of obs =   24927
                                                       F(  1, 24925) =  273.97
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.0219
                                                       Root MSE      =   5.357

------------------------------------------------------------------------------
             |               Robust
    distress |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     dhhgage |  -.2206369     .01333   -16.55   0.000    -.2467644   -.1945095
       _cons |   6.728471   .1171459    57.44   0.000     6.498858    6.958084
------------------------------------------------------------------------------

. regress distress dhhgage [pw=pw_c]
(sum of wgt is   2.5394e+04)

Linear regression                                      Number of obs =   24927
                                                       F(  1, 24925) =  273.97
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.0219
                                                       Root MSE      =   5.357

------------------------------------------------------------------------------
             |               Robust
    distress |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     dhhgage |  -.2206369     .01333   -16.55   0.000    -.2467644   -.1945095
       _cons |   6.728471   .1171459    57.44   0.000     6.498858    6.958084
------------------------------------------------------------------------------

As you can see, the two models are identical (except for the sum of the weights). My basic question is this: why would Stats Canada "insist" on re-scaling the pweights if there were no differences between the models? Or is there a difference that isn't being displayed?

Cheers,

David.

Stata does the rescaling internally for you, so the scale of the original weights does not matter. More precisely, Stata rescales pweights to sum to N, so their mean will indeed be 1. I do not know whether other statistical packages do the same (can SPSS, for example, even apply pweights correctly?). My guess is that Statistics Canada cannot keep up to date with how every statistical package handles weights, so they document the proper usage and leave the details to the end user and his or her software of choice. This is only a guess; a definitive answer could probably be provided only by someone at Statistics Canada.
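
The scale invariance can also be seen on paper. What follows is a sketch, not a statement of Stata's internal code: it assumes the point estimate is ordinary weighted least squares and the reported standard errors come from the usual robust (sandwich) variance estimator. The symbols X, y, W, c, and q are notation introduced here; c > 0 is an arbitrary rescaling constant and q is a finite-sample factor that depends only on N and the number of regressors.

```latex
\begin{align*}
\hat\beta(w) &= (X^\top W X)^{-1} X^\top W y,
  \qquad W = \operatorname{diag}(w_1,\dots,w_N),\\
\hat\beta(cw) &= (c\,X^\top W X)^{-1}\, c\,X^\top W y = \hat\beta(w),\\
\hat V(w) &= q\,(X^\top W X)^{-1}
  \Big(\sum_{i=1}^{N} w_i^2\,\hat e_i^{\,2}\, x_i x_i^\top\Big)
  (X^\top W X)^{-1},\\
\hat V(cw) &= q\,c^{-1}(X^\top W X)^{-1}
  \Big(c^{2}\sum_{i=1}^{N} w_i^2\,\hat e_i^{\,2}\, x_i x_i^\top\Big)
  c^{-1}(X^\top W X)^{-1} = \hat V(w).
\end{align*}
```

Because the coefficients are unchanged, the residuals \hat e_i are unchanged too, and the factors c^{-1}, c^{2}, c^{-1} in the variance cancel. Under these assumptions, dividing every weight by its mean cannot change the coefficients or their robust standard errors.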

For the Stata part of your question, to see possible differences between results, see

Code:

help savedresults
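
As a rough sketch of how that comparison might look, the lines below refit both models from the original post, copy the coefficient vector e(b) and variance matrix e(V) after each fit, and print the largest relative difference between them. The matrix names b_orig, V_orig, b_resc, and V_resc are made up for illustration; the variable names are the ones used in the question.

```stata
* Fit with the original weights and keep the stored estimates
regress distress dhhgage [pw=wts_m]
matrix b_orig = e(b)
matrix V_orig = e(V)

* Fit with the rescaled weights and keep the stored estimates
regress distress dhhgage [pw=pw_c]
matrix b_resc = e(b)
matrix V_resc = e(V)

* Show everything -regress- left behind for the last model
ereturn list

* Largest relative difference between the two sets of results
display mreldif(b_orig, b_resc)
display mreldif(V_orig, V_resc)
```

Values at or near zero (up to floating-point rounding) would indicate that the two fits agree beyond the digits shown in the output table.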

Best
Daniel
