
  • How to obtain standard errors of the percentage distribution of sums, using SVY commands?

    I am using survey data that requires me to use the SVY commands. (I don't think it's relevant here, but I am using CPS data with replicate weights, declared with the statement below.)
    svyset [iw=pwwgt0], sdrweight(pwwgt1-pwwgt160) vce(sdr)

    I am creating sums over groups. I have a household typology (single parent families, married parent families, cohabiting families, etc.) and I am calculating the total number of children (numkids) in the population within each household type. I do this with the total command:

    svy:total numkids, over(typology)

    This gives me the totals and the standard errors of the totals.

    Now I want to calculate the percentage of children in each household type. How do I obtain the standard errors of the percentages?

    Thanks for any tips!

    Rhiannon

  • #2
    The trick is to see that you are asking for ratios: the numerator of each is the total number of kids in the category, and the denominator is the total number of kids in the population.
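
    Written out, as a sketch in notation that is not from the thread (w_i for the survey weight of observation i, y_i for its numkids, t_i for its typology), the share for type j is a ratio of two weighted totals:

    ```latex
    \hat{p}_j = \frac{\sum_i w_i \, y_i \, \mathbf{1}[t_i = j]}{\sum_i w_i \, y_i}
    ```

    Both the numerator and the denominator are estimated from the same sample, which is why the standard error has to come from a ratio estimator rather than from the standard error of the numerator total alone.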

    I assume that you have six types, coded 1 through 6.


    Code:
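    local k = 6
    forvalues j = 1/`k' {
        * kids in household type j; 0 for every other type
        gen kids`j' = numkids*(typology==`j')
        * share of all kids living in type j, with its standard error
        svy: ratio (p`j': kids`j'/numkids)
    }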
    Last edited by Steve Samuels; 22 Jun 2015, 15:57.
    Steve Samuels
    Statistical Consulting

    Stata 14.2



    • #3
      Thanks so much, Steve! That worked perfectly. I really appreciate your assistance.



      • #4
        Dear Steve,

        I ran into trouble when trying to estimate a ratio for my survey dataset. I would like to estimate the ratio yll_cause_x/total_yll.
        The PSU in my survey is the variable `village'.

        I estimated the total YLLs for all causes of death, and the total YLLs for a specific cause (for example, xogan_c), with the following code:
        Code:
        . svy, subpop(death): total (yll_1ca)
        (running total on estimation sample)
        
        Survey: Total estimation
        
        Number of strata =       1        Number of obs   =      4,983
        Number of PSUs   =      32        Population size =  1,853,698
                                          Subpop. no. obs =        145
                                          Subpop. size    = 10,015.567
                                          Design df       =         31
        
        --------------------------------------------------------------
                     |             Linearized
                     |      Total   Std. Err.     [95% Conf. Interval]
        -------------+------------------------------------------------
             yll_1ca |   341711.5   69820.97      199310.7    484112.3
        --------------------------------------------------------------
        *
        . svy, subpop(death): total (yll_1ca), over(xogan_c)
        (running total on estimation sample)
        
        Survey: Total estimation
        
        Number of strata =       1        Number of obs   =      4,983
        Number of PSUs   =      32        Population size =  1,853,698
                                          Subpop. no. obs =        145
                                          Subpop. size    = 10,015.567
                                          Design df       =         31
        
                    0: xogan_c = 0
                    1: xogan_c = 1
        
        --------------------------------------------------------------
                     |             Linearized
                Over |      Total   Std. Err.     [95% Conf. Interval]
        -------------+------------------------------------------------
        yll_1ca      |
                   0 |   333403.4    69038.1      192599.3    474207.6
                   1 |   8308.065   3683.384      795.7546    15820.38
        --------------------------------------------------------------
        Therefore, the expected ratio is about .0243, or 2.4%, meaning that the cause `xogan_c' accounts for about 2.4% of the total YLLs for all causes.
        Code:
        . di 8308.065/341711.5
        .0243131
        Code:
        . svy, subpop(death): ratio yll_xogan_c tyll
        (running ratio on estimation sample)
        
        Survey: Ratio estimation
        
        Number of strata =       1        Number of obs   =      4,983
        Number of PSUs   =      32        Population size =  1,853,698
                                          Subpop. no. obs =        145
                                          Subpop. size    = 10,015.567
                                          Design df       =         31
        
             _ratio_1: yll_xogan_c/tyll
        
        --------------------------------------------------------------
                     |             Linearized
                     |      Ratio   Std. Err.     [95% Conf. Interval]
        -------------+------------------------------------------------
            _ratio_1 |   .0001872   .0000782      .0000277    .0003467
        --------------------------------------------------------------
        The value obtained here, .0001872, is not what I expected given the value computed above, .0243131. What is wrong with this code, and how can I fix it?

        I used the following code to generate the two variables that serve as numerator and denominator; the variable `yll_1ca' holds the YLL value for each death.
        Code:
        egen tyll = total (yll_1ca) if death == 1
        
            bys village: egen yll_xogan_c = total (yll_1ca) if death == 1 & xogan_c == 1
            replace yll_xogan_c = 0 if xogan_c == 0
        Thank you very much in advance!
        Last edited by Thong Nguyen; 29 Nov 2016, 21:16.
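
        For comparison, the indicator-product approach from #2 might be adapted to these variables roughly as sketched below. This has not been run on the data in question, and the names `yll_x' and `p_xogan' are made up for illustration. The point of contrast is that `egen ... total()` stores the same unweighted sample sum on every death row, so `svy: ratio` then adds that constant up again across rows with weights; the sketch instead keeps both numerator and denominator as per-death values and lets `svy: ratio` do the weighted summing.

        ```stata
        * Hypothetical variable name: each death's own YLL if its cause is xogan_c, else 0
        gen yll_x = yll_1ca*(xogan_c == 1)

        * Ratio of the two weighted totals within the death subpopulation
        svy, subpop(death): ratio (p_xogan: yll_x/yll_1ca)
        ```

        If this works as intended, the point estimate should be close to the .0243 computed by hand from the two totals above, now with a linearized standard error attached.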
