How to obtain standard errors of the percentage distribution of sums, using SVY commands?

Rhiannon Patterson

Join Date: Jan 2015

Posts: 5
#1

22 Jun 2015, 14:38

I am using survey data that requires the svy commands. (I don't think it's relevant here, but I am using CPS data with replicate weights, set up with the statement:
svyset [iw=pwwgt0], sdrweight(pwwgt1-pwwgt160) vce(sdr))

I am creating sums over groups. I have a household typology (single parent families, married parent families, cohabiting families, etc.) and I am calculating the total number of children (numkids) in the population within each household type. I do this with the total command:

svy:total numkids, over(typology)

This gives me the totals and the standard errors of the totals.

Now I want to calculate the percentage of children in each household type. How do I obtain the standard errors of the percentages?

Thanks for any tips!

Rhiannon
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#2

22 Jun 2015, 15:53

The trick is to see that you are asking for ratios: the numerator of each is the total number of kids in the category, and the denominator is the total number of kids in the population.

I assume that you have six types.

Code:

local k = 6
forvalues j = 1/`k' {
    gen kids`j' = numkids*(typology==`j')
    svy: ratio (p`j' : kids`j' / numkids)
}
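
In symbols, the loop above estimates the following for each type j. This is only a sketch: the notation (w_i for the weight of observation i, y_i for numkids, t_i for typology) is introduced here for illustration and does not appear in the original post.

```latex
\hat{p}_j \;=\; \frac{\sum_i w_i \, y_i \, \mathbf{1}(t_i = j)}{\sum_i w_i \, y_i},
\qquad j = 1, \dots, 6
```

The numerator is the weighted total of kids`j' and the denominator is the weighted total of numkids, so the ratios sum to 1 across types provided every observation with numkids > 0 falls in exactly one of the six types. Because both totals are estimated from the same sample, the standard error that svy: ratio reports is for the ratio itself, not for either total alone.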

Last edited by Steve Samuels; 22 Jun 2015, 15:57.

Steve Samuels
Statistical Consulting

Stata 14.2
Rhiannon Patterson

Join Date: Jan 2015

Posts: 5
#3

22 Jun 2015, 16:40

Thanks so much, Steve! That worked perfectly. I really appreciate your assistance.

Thong Nguyen

Join Date: Oct 2015
Posts: 236

29 Nov 2016, 21:00

Dear Steve,

I ran into trouble when trying to estimate a ratio for my survey dataset. I would like to obtain the ratio yll_cause_x/total_yll.
The PSUs in the survey are identified by the variable `village'.

The total YLLs for all causes of death, and the total YLLs for a specific cause (for example, xogan_c), are estimated with the following code:

Code:

. svy, subpop(death): total (yll_1ca)
(running total on estimation sample)

Survey: Total estimation

Number of strata =       1        Number of obs   =      4,983
Number of PSUs   =      32        Population size =  1,853,698
                                  Subpop. no. obs =        145
                                  Subpop. size    = 10,015.567
                                  Design df       =         31

--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
     yll_1ca |   341711.5   69820.97      199310.7    484112.3
--------------------------------------------------------------
*
. svy, subpop(death): total (yll_1ca), over(xogan_c)
(running total on estimation sample)

Survey: Total estimation

Number of strata =       1        Number of obs   =      4,983
Number of PSUs   =      32        Population size =  1,853,698
                                  Subpop. no. obs =        145
                                  Subpop. size    = 10,015.567
                                  Design df       =         31

            0: xogan_c = 0
            1: xogan_c = 1

--------------------------------------------------------------
             |             Linearized
        Over |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
yll_1ca      |
           0 |   333403.4    69038.1      192599.3    474207.6
           1 |   8308.065   3683.384      795.7546    15820.38
--------------------------------------------------------------

Therefore, the expected ratio is about .0243, meaning that the cause `xogan_c' accounts for roughly 2.4% of the total YLLs from all causes.

Code:

. di 8308.065/341711.5
.0243131

Code:

. svy, subpop(death): ratio yll_xogan_c tyll
(running ratio on estimation sample)

Survey: Ratio estimation

Number of strata =       1        Number of obs   =      4,983
Number of PSUs   =      32        Population size =  1,853,698
                                  Subpop. no. obs =        145
                                  Subpop. size    = 10,015.567
                                  Design df       =         31

     _ratio_1: yll_xogan_c/tyll

--------------------------------------------------------------
             |             Linearized
             |      Ratio   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    _ratio_1 |   .0001872   .0000782      .0000277    .0003467
--------------------------------------------------------------

The value obtained here, .0001872, is not what I expected given the value .0243131 computed above. What is wrong with this code, and how can I fix it?

I used the following code to generate the numerator and denominator variables; the variable `yll_1ca' holds the YLL value for each death.

Code:

egen tyll = total (yll_1ca) if death == 1

    bys village: egen yll_xogan_c = total (yll_1ca) if death == 1 & xogan_c == 1
    replace yll_xogan_c = 0 if xogan_c == 0

Thank you very much in advance!

Last edited by Thong Nguyen; 29 Nov 2016, 21:16.
