
  • using pwmean with weighted data

    Dear all,

    Is there a way to obtain pairwise comparisons of means for weighted data? pwmean provides pairwise comparisons but does not allow weights for survey data. Is there an easy way to still use this command, or do I have to do everything manually: first define my svyset, then compute the weighted means with mean var1, over(var2), and finally run lincom tests for all possible pairs of categories of var2?

    Thanks.

    Best,
    Vincent

  • #2
    pwmean is really just an ado program that uses regress and pwcompare.

    Here is an example to show this, but you can look at the ado-file to verify.

    Code:
    . webuse nmihs
    
    . pwmean birthwgt, over(agegrp) cimeans
    
    Pairwise comparisons of means with equal variances
    
    over         : agegrp
    
    --------------------------------------------------------------
                 |                                 Unadjusted
        birthwgt |       Mean   Std. Err.     [95% Conf. Interval]
    -------------+------------------------------------------------
          agegrp |
       age15-19  |   2690.038   24.12827      2642.741    2737.334
       age20-24  |   2835.053   18.32823      2799.126     2870.98
       age25-29  |   2916.346   18.37643      2880.324    2952.367
       age30-34  |   2890.537   22.80055      2845.843     2935.23
         age35+  |   2842.239   36.22257      2771.235    2913.242
    --------------------------------------------------------------
    
    . quietly regress birthwgt i.agegrp
    
    . pwcompare agegrp, cimargins
    
    Pairwise comparisons of marginal linear predictions
    
    Margins      : asbalanced
    
    --------------------------------------------------------------
                 |                                 Unadjusted
                 |     Margin   Std. Err.     [95% Conf. Interval]
    -------------+------------------------------------------------
          agegrp |
       age15-19  |   2690.038   24.12827      2642.741    2737.334
       age20-24  |   2835.053   18.32823      2799.126     2870.98
       age25-29  |   2916.346   18.37643      2880.324    2952.367
       age30-34  |   2890.537   22.80055      2845.843     2935.23
         age35+  |   2842.239   36.22257      2771.235    2913.242
    --------------------------------------------------------------

    pwmean does not allow weights, nor does it allow the svy prefix,
    but we can use svy: regress (or just regress with weights) followed
    by a call to pwcompare.
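
    For the "just regress with weights" route, a minimal sketch, assuming the
    sampling weight is the finwgt variable that ships with nmihs.dta. The point
    estimates should agree with the svy: regress results, but the standard
    errors will generally differ because this version ignores the stratification.

    ```stata
    * load the example data; finwgt is the sampling weight stored in the dataset
    webuse nmihs, clear

    * weighted regression without the svy prefix (pweights imply robust SEs)
    regress birthwgt i.agegrp [pweight = finwgt]

    * weighted group means with confidence intervals
    pwcompare agegrp, cimargins
    ```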

    Here is the above example, but using the svyset settings that are already
    stored in the nmihs.dta dataset.

    Code:
    . svyset
    
          pweight: finwgt
              VCE: linearized
      Single unit: missing
         Strata 1: stratan
             SU 1: <observations>
            FPC 1: <zero>
    
    . svy: regress birthwgt i.agegrp
    (running regress on estimation sample)
    
    Survey: Linear regression
    
    Number of strata   =         6                  Number of obs      =      9946
    Number of PSUs     =      9946                  Population size    = 3895561.7
                                                    Design df          =      9940
                                                    F(   4,   9937)    =     25.86
                                                    Prob > F           =    0.0000
                                                    R-squared          =    0.0131
    
    ------------------------------------------------------------------------------
                 |             Linearized
        birthwgt |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          agegrp |
       age20-24  |   109.8123   22.52992     4.87   0.000     65.64906    153.9755
       age25-29  |   194.7883   22.47066     8.67   0.000     150.7412    238.8353
       age30-34  |   208.1622   24.36449     8.54   0.000     160.4028    255.9215
         age35+  |    193.807   34.06907     5.69   0.000     127.0247    260.5893
                 |
           _cons |   3205.137   18.59472   172.37   0.000     3168.688    3241.587
    ------------------------------------------------------------------------------
    
    . pwcompare agegrp, cimargins
    
    Pairwise comparisons of marginal linear predictions
    
                                                    Design df          =      9940
    
    Margins      : asbalanced
    
    --------------------------------------------------------------
                 |                                 Unadjusted
                 |     Margin   Std. Err.     [95% Conf. Interval]
    -------------+------------------------------------------------
          agegrp |
       age15-19  |   3205.137   18.59472      3168.688    3241.587
       age20-24  |   3314.949   12.12495      3291.182    3338.717
       age25-29  |   3399.925   12.07594      3376.254    3423.597
       age30-34  |   3413.299   15.28501      3383.338    3443.261
         age35+  |   3398.944   28.25415       3343.56    3454.328
    --------------------------------------------------------------
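
    The cimargins option reports the weighted group means themselves. To see the
    pairwise differences asked about in #1, with tests, something along these
    lines should work. This is a sketch on the same nmihs data; effects and
    mcompare() are documented pwcompare options, and as far as I know only the
    bonferroni, sidak, and scheffe adjustments are allowed after svy estimation.

    ```stata
    * fit the survey-weighted model using the svyset stored in the dataset
    webuse nmihs, clear
    quietly svy: regress birthwgt i.agegrp

    * all pairwise differences with test statistics, p-values, and CIs
    pwcompare agegrp, effects

    * the same table with Bonferroni-adjusted p-values and intervals
    pwcompare agegrp, effects mcompare(bonferroni)
    ```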



    • #3
      Once again, thank you Jeff. This method works perfectly.



      • #4
        I am facing one additional problem, though. Suppose I want to compare the mean of var1 across different age categories. With my weighted survey data I can follow the method described above by Jeff.

        Hence, I would do:
        Code:
        svyset [iweight=w1]
        svy: regress var1 i.age
        pwcompare age, groups

        This works perfectly. In addition, I would like to compare the mean of var1 for my whole survey-weighted sample with the weighted mean for each age category, to see to what extent the overall mean of var1 is statistically different from the mean of each age group separately. Is there a way to adjust the code above without running into collinearity problems in the regression model?

        Thanks.



        • #5
          I think a similar problem has been discussed, but I have no idea where to find this discussion. From what I vaguely remember, the core question was whether it makes any sense to compare the mean of a group to the mean of a subsample of this group. The main argument was that you base both estimates on partly the same individuals, which does not seem sound.

          Best
          Daniel



          • #6
            pwcompare does pairwise comparisons of marginal linear predictions.
            It is not capable of comparing the grand mean with each group mean.

            However, contrast has an operator that compares the grand marginal
            linear prediction to each marginal linear prediction.

            In fact there are two versions of the grand mean contrast operator.
            The g. operator performs simple/unweighted differences from the grand
            mean, and the gw. operator computes the weighted differences.
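
            Written out, the two operators can be sketched as follows. The
            symbols are my own notation, not Stata's: mu_j is the marginal
            linear prediction for group j, k is the number of groups, and the
            w_i are the weights Stata assigns to the groups (see help contrast
            for exactly how they are computed). The last lines check the first
            g. contrast by hand from the svy margins reported in #2.

            ```latex
            \begin{align*}
            g_j  &= \mu_j - \frac{1}{k}\sum_{i=1}^{k} \mu_i
                 && \text{(g.: unweighted grand mean)} \\
            gw_j &= \mu_j - \sum_{i=1}^{k} w_i \mu_i ,
                 \quad \sum_{i=1}^{k} w_i = 1
                 && \text{(gw.: weighted grand mean)} \\
            % check with k = 5 and the margins from #2
            \frac{1}{5}\left(3205.137 + 3314.949 + 3399.925 + 3413.299 + 3398.944\right)
                 &\approx 3346.451 \\
            g_{\text{age15-19}} &\approx 3205.137 - 3346.451 = -141.314
            \end{align*}
            ```

            This agrees, up to rounding, with the -141.3139 that contrast
            g.agegrp reports for age15-19 vs mean in this post.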

            Code:
            . help contrast

            Here is an example, using the same dataset as before.

            Code:
            . webuse nmihs
            
            . svyset
            
                  pweight: finwgt
                      VCE: linearized
              Single unit: missing
                 Strata 1: stratan
                     SU 1: <observations>
                    FPC 1: <zero>
            
            . svy: regress birthwgt i.agegrp
            (running regress on estimation sample)
            
            Survey: Linear regression
            
            Number of strata   =         6                  Number of obs      =      9946
            Number of PSUs     =      9946                  Population size    = 3895561.7
                                                            Design df          =      9940
                                                            F(   4,   9937)    =     25.86
                                                            Prob > F           =    0.0000
                                                            R-squared          =    0.0131
            
            ------------------------------------------------------------------------------
                         |             Linearized
                birthwgt |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                  agegrp |
               age20-24  |   109.8123   22.52992     4.87   0.000     65.64906    153.9755
               age25-29  |   194.7883   22.47066     8.67   0.000     150.7412    238.8353
               age30-34  |   208.1622   24.36449     8.54   0.000     160.4028    255.9215
                 age35+  |    193.807   34.06907     5.69   0.000     127.0247    260.5893
                         |
                   _cons |   3205.137   18.59472   172.37   0.000     3168.688    3241.587
            ------------------------------------------------------------------------------
            
            . contrast g.agegrp, nowald
            
            Contrasts of marginal linear predictions
            
                                                            Design df          =      9940
            
            Margins      : asbalanced
            
            ---------------------------------------------------------------------
                                |   Contrast   Std. Err.     [95% Conf. Interval]
            --------------------+------------------------------------------------
                         agegrp |
            (age15-19 vs mean)  |  -141.3139   16.74263     -174.1329    -108.495
            (age20-24 vs mean)  |  -31.50168   12.67539     -56.34802   -6.655341
            (age25-29 vs mean)  |   53.47434   12.61681      28.74283    78.20584
            (age30-34 vs mean)  |   66.84821   14.57578      38.27673    95.41969
              (age35+ vs mean)  |   52.49308   23.50873      6.411203    98.57495
            ---------------------------------------------------------------------
            
            . contrast gw.agegrp, nowald
            
            Contrasts of marginal linear predictions
            
                                                            Design df          =      9940
            
            Margins      : asbalanced
            
            ---------------------------------------------------------------------
                                |   Contrast   Std. Err.     [95% Conf. Interval]
            --------------------+------------------------------------------------
                         agegrp |
            (age15-19 vs mean)  |  -150.3153   17.72641     -185.0627    -115.568
            (age20-24 vs mean)  |  -40.50306   10.82843     -61.72898   -19.27714
            (age25-29 vs mean)  |   44.47296   10.13777      24.60087    64.34504
            (age30-34 vs mean)  |   57.84683   13.74461      30.90461    84.78905
              (age35+ vs mean)  |    43.4917   26.93053     -9.297601    96.28099
            ---------------------------------------------------------------------
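
            The tables above show confidence intervals only. Since the question
            in #4 was whether each group mean is statistically different from
            the overall mean, a sketch that also prints test statistics and
            p-values may be useful. It assumes the same nmihs data and uses the
            documented effects and mcompare() options of contrast; whether a
            multiple-comparison adjustment is appropriate here is a judgment
            call.

            ```stata
            * fit the survey-weighted model using the svyset stored in the dataset
            webuse nmihs, clear
            quietly svy: regress birthwgt i.agegrp

            * each group vs the weighted grand mean: contrast, SE, t, p-value, CI
            contrast gw.agegrp, effects nowald

            * the same contrasts with Bonferroni-adjusted p-values and intervals
            contrast gw.agegrp, effects nowald mcompare(bonferroni)
            ```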



            • #7
              Two of the threads that Daniel may have been thinking of are:

              http://www.statalist.org/forums/foru...tandard-errors

              and

              http://www.stata.com/statalist/archi.../msg00296.html
              Steve Samuels
              Statistical Consulting
              [email protected]

              Stata 14.2



              • #8
                Thanks for the threads and the proposed method! They are really helpful!

