  • Problem with generating higher moments for variables from survey data



    Hello,

    First time poster (and < 1 year Stata user) so feel free to suggest improvements for posting etiquette!

    Here's my question: how can I calculate higher moments (in particular, Pearson's moment coefficient of skewness) for variables in a way that accounts for inverse probability survey weights?

    I'm working with survey data (Stata 14.1 on Mac OS X 10.11.2) but do not know how to incorporate svy or pweight into commands that generate higher moments for survey variables.

    The following code runs but does not account for the survey weights (pweights):
    Code:
    summarize incentive if aco_analytic_X1 == 1, detail
    (FYI: aco_analytic_X1 indicates whether the observation is part of my analytic sample, i.e., does not have any missing data.)

    Meanwhile, when I enter
    Code:
    svy: summarize incentive if aco_analytic_X1 == 1, detail
    Stata issued this error message: summarize is not supported by svy with vce(linearized); see help svy estimation for a list of Stata estimation commands that are supported by svy

    I've tried Nick Cox's -moments- but it appears that it does not support pweight. When I enter:

    Code:
    moments incentive if aco_analytic_X1 == 1 [pweight=fwt]
    Stata issued this error message: pweight not allowed

    In an ideal world, I would incorporate such code into the overall command framework that I'm using for my analysis. I'm using a loop structure with -putexcel- to write summary statistics (mean and sd) to an Excel document.
    Code:
    ********************************************************************************
    **                        Table 1. Summary descriptives                           **
    ********************************************************************************
    
    ************ Summarize DVs from respective analytic samples ********************
    putexcel set table1, replace    
    
    loc row = 2
    local n = 0
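    foreach x in $dv {
        local n = `n' + 1
        * subpop() restricts estimation to the analytic sample; the -if- command
        * would only test the first observation of `x'_analytic_X1
        svy, subpop(if `x'_analytic_X1 == 1): mean `x'
        estat sd
        putexcel A`row'=("`x'") B`row'=matrix(e(b)) C`row'=matrix(r(sd))
        loc row = `row' + 1
        }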
    
    *************** Summarize covariates from ACO analytic sample ******************
    
    putexcel A1=("variable") B1=("mean") C1=("sd")
    loc row = 6
    local n = 0
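    foreach x in $X1 {
        local n = `n' + 1
        * subpop() restricts estimation to the ACO analytic sample; the -if-
        * command would only test the first observation of aco_analytic_X1
        svy, subpop(if aco_analytic_X1 == 1): mean `x'
        estat sd
        putexcel A`row'=("`x'") B`row'=matrix(e(b)) C`row'=matrix(r(sd))
        loc row = `row' + 1
        }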
    However, as I only really need higher moments for a few key variables, this is a second-order requirement. Most simply, I'd like to accurately calculate Pearson's moment coefficient of skewness for the "incentive" variable.
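
    For reference, the weighted point estimate being asked about can be written as follows. This is a sketch of the standard weighted-moment definition, assuming w_i denotes the pweight of observation i and x_i its value of "incentive"; it says nothing about standard errors.

    ```latex
    \bar{x}_w = \frac{\sum_i w_i x_i}{\sum_i w_i}, \qquad
    m_r = \frac{\sum_i w_i \,(x_i - \bar{x}_w)^r}{\sum_i w_i}, \qquad
    \text{skewness} = \frac{m_3}{m_2^{3/2}}, \qquad
    \text{kurtosis} = \frac{m_4}{m_2^{2}}
    ```

    Because each statistic is a ratio of weighted moments, multiplying every weight by the same constant leaves it unchanged.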

    Thanks!
    Adam

  • #2
    Hello Adam,

    Welcome to the Stata Forum.

    I'm not sure I understood your query correctly, but if you mean the product-moment correlation statistic (r), one alternative, suggested in a reference book (Heeringa, West, and Berglund, Applied Survey Data Analysis, CRC Press, pp. 135-136), is to standardize both variables and then fit a simple linear regression with the proportional weight estimation.

    Hopefully that helps.

    Best,

    Marcos


    • #3
      A note on moments (SSC): you are correct. The help does explain: "aweights and fweights are allowed", meaning that other kinds aren't. The restriction is that of summarize, for which moments is just a wrapper.



      • #4
        Nick, thank you for confirming that moments (SSC), like summarize, doesn't allow pweights! Do you happen to know if there is some way to calculate the skewness of a variable while accounting for pweights?

        Marcos, sorry, I think I may have been unclear. I was hoping to calculate higher moments such as skewness and kurtosis for variables derived from survey data. The issue is that the commands I'm familiar with for handling survey weights (such as svy) appear difficult to combine with commands that calculate properties such as skewness (summarize, detail).

        Thanks,
        Adam



        • #5
          summarize doesn't purport to calculate standard errors, so I can only suggest using fweights instead. Others here are better qualified to comment on what's right and what's wrong on that. It's a personal maxim that whatever you guess wildly about [SVY] problems is likely to be wrong.
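
          A minimal sketch of a related workaround, offered under assumptions rather than as a definitive answer: summarize also accepts aweights, which need not be integers, and since skewness and kurtosis are ratios of weighted moments, the rescaling that aweights apply should not change the point estimates. The sketch assumes fwt is the pweight variable from #1; it gives point estimates only and no design-based standard errors.

          ```stata
          * Point estimates only: aweights are rescaled to sum to N, which leaves
          * ratios of weighted moments (skewness, kurtosis) unchanged.
          summarize incentive [aweight = fwt] if aco_analytic_X1 == 1, detail

          * summarize, detail leaves the results in r()
          display "weighted skewness = " r(skewness)
          display "weighted kurtosis = " r(kurtosis)
          ```

          The standard deviation that summarize reports this way is not a survey-design estimate, so it should not be read as one.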



          • #6
            Adam Markovitz: sorry for the misunderstanding, Adam. Now I see what you want. Just for the sake of finding some solution to your query, feeble as it may be, I wonder if you could check skewness with a graphical display (some say it's more reliable than the calculations). Perhaps a histogram or a boxplot with the weights would provide a sort of "proxy" for what you need. For the boxplot, we can insert the pweights directly and be sure we're "abiding by" the survey design. For the histogram, the book I mentioned in #2 used fweights, and this was already suggested by Nick in #5. If the variable has decimals, for example, it was converted into integers. On this matter, this is the furthest I can go.
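
            The graphical check could be sketched roughly as below. It assumes fwt is the pweight variable from #1; fwt_int is a hypothetical helper variable introduced only for illustration, and rounding the weights to integers is an approximation.

            ```stata
            * Box plot: graph box accepts pweights directly
            graph box incentive [pweight = fwt] if aco_analytic_X1 == 1

            * Histogram: only fweights are allowed, so round the weights to integers
            * (fwt_int is a hypothetical name; rounding loses some precision)
            generate long fwt_int = round(fwt)
            histogram incentive [fweight = fwt_int] if aco_analytic_X1 == 1
            ```

            If the weights are small fractions, multiplying them by a constant before rounding would preserve more of their relative sizes.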

            Hopefully that helps.

            Best,

            Marcos