  • Rescaling -pweights- doesn't alter coefficients?

    Hello,

    I am using survey data from Statistics Canada (N = 25113), which come with p-weights. According to the user documentation, when a regression is run on a subset of the data, one needs to ensure that the mean of the p-weights is 1. In other words, new_weight = wts_m / [mean of wts_m]. I was curious what effect this has on the regression model, so I compared a simple regression using the original weights (wts_m) with the same regression using the rescaled weights (pw_c).

    Code:
    . regress distress dhhgage [pw=wts_m]
    (sum of wgt is   2.8121e+07)
    
    Linear regression                                      Number of obs =   24927
                                                           F(  1, 24925) =  273.97
                                                           Prob > F      =  0.0000
                                                           R-squared     =  0.0219
                                                           Root MSE      =   5.357
    
    ------------------------------------------------------------------------------
                 |               Robust
        distress |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         dhhgage |  -.2206369     .01333   -16.55   0.000    -.2467644   -.1945095
           _cons |   6.728471   .1171459    57.44   0.000     6.498858    6.958084
    ------------------------------------------------------------------------------
    
    . regress distress dhhgage [pw=pw_c]
    (sum of wgt is   2.5394e+04)
    
    Linear regression                                      Number of obs =   24927
                                                           F(  1, 24925) =  273.97
                                                           Prob > F      =  0.0000
                                                           R-squared     =  0.0219
                                                           Root MSE      =   5.357
    
    ------------------------------------------------------------------------------
                 |               Robust
        distress |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
         dhhgage |  -.2206369     .01333   -16.55   0.000    -.2467644   -.1945095
           _cons |   6.728471   .1171459    57.44   0.000     6.498858    6.958084
    ------------------------------------------------------------------------------
    As you can see, the two models are identical (except for the sum of the weights). My basic question is this: why would Stats Canada "insist" on re-scaling the pweights if there were no differences between the models? Or is there a difference that isn't being displayed?
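
    The same comparison can be checked outside Stata. Below is a minimal sketch in Python (standard library only) that fits a weighted simple regression twice, once with raw weights and once with weights divided by their mean, and compares the slope, intercept, and a robust (sandwich) standard error for the slope. The data, the function name weighted_ols, and the n/(n-2) small-sample factor are assumptions made for illustration; this is not Stata's implementation and the numbers are not from the Statistics Canada file.

```python
import math


def weighted_ols(x, y, w):
    """Weighted simple regression of y on x.

    Returns (intercept, slope, robust_se_slope). The standard error is a
    sandwich estimate with an assumed n/(n-2) small-sample factor.
    """
    n = len(x)
    sw = sum(w)
    # Weighted means of x and y.
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    # Weighted sums of squares and cross-products around the weighted means.
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # "Meat" of the sandwich: squared weighted score contributions.
    meat = sum((wi * (xi - xbar) * ei) ** 2 for wi, xi, ei in zip(w, x, resid))
    var_slope = (n / (n - 2)) * meat / sxx ** 2
    return intercept, slope, math.sqrt(var_slope)


# Made-up illustration data (hypothetical, not survey values).
x = [20, 25, 30, 35, 40, 45, 50, 55]
y = [6.1, 5.2, 5.9, 4.8, 4.1, 4.6, 3.2, 3.9]
w = [1200.0, 850.0, 430.0, 990.0, 1500.0, 610.0, 720.0, 1100.0]

# Rescale so that the weights have mean 1 (and therefore sum to n).
mean_w = sum(w) / len(w)
w_scaled = [wi / mean_w for wi in w]

fit_raw = weighted_ols(x, y, w)
fit_scaled = weighted_ols(x, y, w_scaled)

print("raw weights:     ", fit_raw)
print("rescaled weights:", fit_scaled)
```

    In this sketch the two fits agree up to floating-point rounding, because multiplying every weight by the same positive constant cancels out of both the coefficient and the sandwich variance formulas.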

    Cheers,

    David.
    Last edited by David Speed; 13 Feb 2017, 07:03.

  • #2
    Stata does the re-scaling internally for you, so the scale of the original weights does not matter. More precisely, Stata rescales pweights to sum to N (so their mean will indeed be 1). I have no idea whether other statistical packages do the same (can SPSS, for example, even apply pweights correctly?). My guess is that Statistics Canada cannot keep up to date on how every statistical package handles weights, so they document the proper usage and leave the details to the end user and her or his software of choice. This is only a guess; a definitive answer could probably only be provided by someone at Statistics Canada.
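
    To see why the scale drops out, here is a short derivation sketch. It assumes the textbook weighted least squares estimator and the usual sandwich (robust) variance estimator; the symbols X, y, W, c, and the residual matrix are notation introduced here for illustration, and this is not a description of Stata's internal code.

```latex
% W = diag(w_1, ..., w_n) holds the weights; c > 0 is any rescaling constant.
\[
\hat\beta(cW) = (X^\top cW X)^{-1} X^\top cW y
             = \tfrac{1}{c}\,(X^\top W X)^{-1}\, c\, X^\top W y
             = \hat\beta(W).
\]
% The residuals e_i = y_i - x_i^\top \hat\beta are therefore unchanged.
% With \hat E = diag(e_1^2, ..., e_n^2), the sandwich variance satisfies
\[
\hat V(cW) = (X^\top cW X)^{-1}\,\bigl(X^\top cW \hat E\, cW X\bigr)\,(X^\top cW X)^{-1}
           = \tfrac{1}{c}\cdot c^{2}\cdot \tfrac{1}{c}\;\hat V(W)
           = \hat V(W).
\]
```

    A finite-sample correction that depends only on the number of observations and the number of parameters is not affected by c either, which is consistent with the identical standard errors, t statistics, and confidence intervals in the two outputs above.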

    For the Stata part of your question: to check whether the two sets of results differ in anything that is not displayed, see

    Code:
    help savedresults
    Best
    Daniel
    Last edited by daniel klein; 13 Feb 2017, 07:52.
