Hi all, I'm analyzing survey data. I have a series of proportions and would like to produce standard errors and 90% CIs around them. I've accounted for the survey design and incorporated replicate weights. Because of the sample size, the analysis produces some categories with very small numbers of people. As others have pointed out, and as the code below shows, proportion and tabulate produce the same proportions and standard errors, but the 90% CIs differ. (Note: in the code below, the _subpop_7 rows in the proportion results should match the results from the tabulate command.) My questions:
1) I am leaning toward using the tabulate results for the CIs, as some of the CIs from the proportion command have negative lower bounds. Thoughts?
2) Are there other ways of calculating the 90% CIs in Stata for survey proportions that I should consider here?
3) Are there any good applied research studies that show how to present proportions with SEs or 90% CIs (in tables or graphs)? This may not be a Stata question.
Thanks in advance for any advice!
Code:
. svyset [pw = WTSURVY], jkrw(RW0001- RW0320, multiplier(0.05)) vce(jack) mse
pweight: WTSURVY
VCE: jackknife
MSE: on
jkrweight: RW0001 .. RW0320
Single unit: missing
Strata 1: <one>
SU 1: <observations>
FPC 1: <zero>
. svy: proportion RACETHM_n, over(career_stage_rev2 DGRDG_n) level(90)
(running proportion on estimation sample)
Jackknife replications (320)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
.................................................. 50
.................................................. 100
.................................................. 150
.................................................. 200
.................................................. 250
.................................................. 300
....................
Survey: Proportion estimation
Number of strata = 1 Number of obs = 1,311
Population size = 252,142.35
Replications = 320
Design df = 319
AsianNHOPI: RACETHM_n = AsianNHOPI
AIAN: RACETHM_n = AIAN
Black: RACETHM_n = Black
Hispanic: RACETHM_n = Hispanic
White: RACETHM_n = White
MR: RACETHM_n = MR
Over: career_stage_rev2 DGRDG_n
_subpop_1: 20 or more years Bachelors
_subpop_2: 20 or more years Masters
_subpop_3: 20 or more years Doctorate
_subpop_4: 20 or more years Professional
_subpop_5: Less than 20 yrs Bachelors
_subpop_6: Less than 20 yrs Masters
_subpop_7: Less than 20 yrs Doctorate
_subpop_8: Less than 20 yrs Professional
--------------------------------------------------------------
| Jknife * Normal
Over | Proportion Std. Err. [90% Conf. Interval]
-------------+------------------------------------------------
AsianNHOPI |
_subpop_1 | .0232649 .0103291 .0062255 .0403043
_subpop_2 | .0101458 .0081955 -.0033739 .0236655
_subpop_3 | .0861882 .0234436 .0475145 .1248618
_subpop_4 | 0 (no observations)
_subpop_5 | .1010706 .025582 .0588694 .1432719
_subpop_6 | .1334251 .0323168 .0801139 .1867364
_subpop_7 | .2483284 .043813 .1760524 .3206043
_subpop_8 | 0 (no observations)
-------------+------------------------------------------------
AIAN |
_subpop_1 | 0 (no observations)
_subpop_2 | .022717 .0171829 -.0056286 .0510626
_subpop_3 | 0 (no observations)
_subpop_4 | 0 (no observations)
_subpop_5 | .000104 .000122 -.0000973 .0003053
_subpop_6 | .0080136 .005543 -.0011304 .0171576
_subpop_7 | 0 (no observations)
_subpop_8 | 0 (no observations)
-------------+------------------------------------------------
Black |
_subpop_1 | .0325514 .0203369 -.0009974 .0661001
_subpop_2 | .0865779 .0572381 -.0078446 .1810005
_subpop_3 | .0072528 .0054652 -.0017628 .0162684
_subpop_4 | 0 (no observations)
_subpop_5 | .0464535 .0292895 -.0018638 .0947708
_subpop_6 | .0848761 .0471426 .0071076 .1626445
_subpop_7 | .0030085 .0018134 .000017 .006
_subpop_8 | 0 (no observations)
-------------+------------------------------------------------
Hispanic |
_subpop_1 | .0366649 .0248132 -.0042681 .0775978
_subpop_2 | .0493453 .0213093 .0141927 .084498
_subpop_3 | .0232171 .0143399 -.0004386 .0468728
_subpop_4 | 0 (no observations)
_subpop_5 | .0834066 .0350203 .0256355 .1411777
_subpop_6 | .0727584 .0242182 .032807 .1127099
_subpop_7 | .0743311 .0250366 .0330296 .1156325
_subpop_8 | .2790089 .2699777 -.1663584 .7243761
-------------+------------------------------------------------
White |
_subpop_1 | .8807481 .043279 .809353 .9521431
_subpop_2 | .8079233 .0656598 .699608 .9162386
_subpop_3 | .880132 .0284381 .8332192 .9270448
_subpop_4 | 1 . . .
_subpop_5 | .7615289 .0511107 .6772145 .8458433
_subpop_6 | .6686341 .0495443 .5869037 .7503645
_subpop_7 | .6694451 .0474141 .5912287 .7476614
_subpop_8 | .2771716 .2752663 -.1769198 .731263
-------------+------------------------------------------------
MR |
_subpop_1 | .0267708 .0221536 -.0097747 .0633164
_subpop_2 | .0232907 .0153762 -.0020746 .048656
_subpop_3 | .0032099 .0024432 -.0008206 .0072404
_subpop_4 | 0 (no observations)
_subpop_5 | .0074364 .0028953 .0026601 .0122126
_subpop_6 | .0322927 .0186794 .0014783 .063107
_subpop_7 | .004887 .003178 -.0003555 .0101295
_subpop_8 | .4438195 .2456526 .0385802 .8490589
--------------------------------------------------------------
.
. svy, subpop(if career_stage_rev2==2 & DGRDG_n==3): tabulate RACETHM_n, se ci level(90)
(running tabulate on estimation sample)
Number of strata = 1 Number of obs = 1,311
Population size = 252,142.35
Subpop. no. obs = 241
Subpop. size = 43,459.37
Replications = 320
Design df = 319
----------------------------------------------------------
RACETHM_n | proportion se lb ub
----------+-----------------------------------------------
AsianNHO | .2483 .0438 .1832 .3273
AIAN | 0 0
Black | .003 .0018 .0011 .0081
Hispanic | .0743 .025 .0422 .1277
White | .6694 .0474 .5872 .7425
MR | .0049 .0032 .0017 .0142
|
Total | 1
----------------------------------------------------------
Key: proportion = cell proportion
se = jackknife standard error of cell proportion
lb = lower 90% confidence bound for cell proportion
ub = upper 90% confidence bound for cell proportion
Table contains a zero in the marginals.
Statistics cannot be computed.
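For reference, the gap between the two sets of intervals is consistent with the two commands building the CI differently: the proportion output above is labeled Normal, while svy: tabulate computes its bounds on the logit scale and transforms them back, so they always fall inside (0, 1). The sketch below works through the Black row of _subpop_7 by hand. The symbols $\hat{p}$, $\widehat{se}$, and $t$ are notation introduced here, and $t \approx 1.6496$ is an approximate critical value for 319 design df, so the last digit may differ slightly from what Stata prints.

```latex
% Inputs taken from the output above (Black, _subpop_7)
\hat{p} = 0.0030085, \qquad \widehat{se} = 0.0018134, \qquad t = t_{0.95,\,319} \approx 1.6496

% Normal (Wald-type) interval, as in the svy: proportion output
\hat{p} \pm t\,\widehat{se}
  = 0.0030085 \pm 1.6496 \times 0.0018134
  \approx (0.000017,\ 0.0060)

% Logit-transformed interval, as in the svy: tabulate output
\operatorname{logit}(\hat{p}) = \ln\frac{\hat{p}}{1-\hat{p}} \approx -5.8033,
\qquad
\widehat{se}_{\text{logit}} = \frac{\widehat{se}}{\hat{p}\,(1-\hat{p})} \approx 0.6046

\operatorname{logit}(\hat{p}) \pm t\,\widehat{se}_{\text{logit}}
  \approx (-6.8006,\ -4.8060)

% Back-transform each endpoint with the inverse logit
\frac{e^{-6.8006}}{1+e^{-6.8006}} \approx 0.0011,
\qquad
\frac{e^{-4.8060}}{1+e^{-4.8060}} \approx 0.0081
```

Both hand calculations round to the bounds Stata printed (.000017 to .006 for proportion, .0011 to .0081 for tabulate), which suggests that the choice of interval, not the jackknife variance, drives the difference.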
