Lopsided confidence intervals with svy commands and how to export table estimates with confidence intervals

Doug Hess



21 Mar 2023, 14:15

Hello. A few related questions from one project. I am using Stata/MP 15.1.

I want to produce many tables from a voter registration and voting survey. For instance, the code below gives the voting rate (weighted and averaged over several years) for Wisconsin citizens who are either part-time or full-time high school students. The results are shown for mid-term election cycles (pres==0) and presidential cycles (pres==1).

Code:

table pres [pw=weight] if stateabb=="WI" & ptfths==1, c(mean voted) f(%5.3fc)

pres	mean(voted)

0	0.316
1	0.604

I had hoped to use the -xtable- command (with syntax similar to the example above) because it would let me run many such tables and export them quickly into MS Excel files, with one sheet for each of many subpopulations. However, I want the tables to include confidence intervals for the estimates because the samples for some subpopulations are small (n = 100 for the Wisconsin student example above). The method I know for getting confidence intervals for proportions is the -svy tabulate- command (see below), but I do not know how to export tidy tables from it. First question: What other ways are there to produce estimates and CIs, ideally in a single table? Subquestion: What is the best way to export results from any of these methods? For instance, -putdocx- doesn't seem to work with -svy tabulate-. The exported file format doesn't have to be Excel.

The second question relates to using svy tabulate to get the CIs. To set the weights, I used

Code:

svyset _n [pweight=weight]

I then issued this command:

Code:

svy, subpop(if stateabb=="WI" & ptfths==1): tabulate pres voted, ci row

Here is the output:

(running tabulate on estimation sample)
Number of strata = 1                 Number of obs   = 815,414
Number of PSUs   = 815,414           Population size = 1,842,740,528
                                     Subpop. no. obs = 100
                                     Subpop. size    = 261,136.862793
                                     Design df       = 815,413

                      voted
Cycle        Did not vote       Voted
Midterm      .6841              .3159
             [.5462,.7957]      [.2043,.4538]
President    .3963              .6037
             [.2601,.5507]      [.4493,.7399]
Total        .5309              .4691
             [.4274,.6318]      [.3682,.5726]

Key: row proportion
     [95% confidence interval for row proportion]

Pearson:
  Uncorrected   chi2(1)       = 6.75e+04
  Design-based  F(1, 815413)  = 7.6817    P = 0.0056

Second question: Why are the lower and upper CI bounds uneven around the estimated voting rate? E.g., for voting in presidential election cycles, the lower bound is 60.37 - 44.93 = 15.44 points below the estimate, while the upper bound is 73.99 - 60.37 = 13.62 points above it. I rarely analyze small-sample estimates or use the -svy- way of handling weights (I usually analyze with regression and use [pweight=weight]), so I have not encountered this before. Perhaps I am using the -subpop- option of the -svy- prefix incorrectly? If there is another strategy to get the estimates and CIs together, perhaps it won't produce this problem, but I still wonder what is going on here.

I also noticed that if I just drop the other observations instead of using the -subpop- option, I get a different set of bounds. The estimate is the same, but the bounds differ (and are still lopsided): 74.20 - 60.37 = 13.83 and 60.37 - 44.66 = 15.71. Third question: Why do these CIs differ from the output above, which used the -subpop- option?

Code:

  preserve
  * keep only the subpopulation instead of using -subpop()-
  keep if stateabb=="WI" & ptfths==1
  svy: tabulate pres voted, ci row
  restore

(running tabulate on estimation sample)
Number of strata = 1                 Number of obs   = 100
Number of PSUs   = 100               Population size = 261,136.86
                                     Design df       = 99

                      voted
Cycle        Did not vote       Voted
Midterm      .6841              .3159
             [.5437,.7974]      [.2026,.4563]
President    .3963              .6037
             [.258,.5534]       [.4466,.742]
Total        .5309              .4691
             [.4256,.6335]      [.3665,.5744]

Key: row proportion
     [95% confidence interval for row proportion]

Pearson:
  Uncorrected   chi2(1)   = 8.2795
  Design-based  F(1, 99)  = 7.6049    P = 0.0069

Thanks. I keep thinking I'm missing something obvious, but I don't use -svy- often. Perhaps there's a good primer out there on svy (besides what's in the manuals)?

-Doug


Leonardo Guizzetti

#2

21 Mar 2023, 14:28

In answer to your second question, confidence intervals aren’t required to be symmetric. -svy tabulate- computes the CIs for proportions on the logit scale, where they are symmetric around the estimate. The endpoints are then transformed back to the more familiar probability scale, which makes them asymmetric on that scale.
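
The asymmetry can be checked by hand from the numbers in the first post. What follows is only a rough sketch and not part of the original reply: the local macro names are made up for illustration, and the values are the rounded presidential-cycle figures from the -subpop()- output. The last two lines print the t critical values implied by the two design df values reported in the thread (815,413 with -subpop()-, 99 after -keep if-). The design df enters the CI calculation, so it is one likely reason the two sets of bounds differ.

Code:

* rounded values copied from the -subpop()- output (presidential cycles, voted)
* run these lines together from a do-file so the local macros stay defined
local p  = .6037
local lb = .4493
local ub = .7399

* distance from the estimate to each bound on the probability scale (unequal)
display "probability scale: " %6.4f (`p' - `lb') "  vs  " %6.4f (`ub' - `p')

* the same distances on the logit scale (equal up to rounding)
display "logit scale:       " %6.4f (logit(`p') - logit(`lb')) "  vs  " %6.4f (logit(`ub') - logit(`p'))

* t critical values for a 95% CI under the two reported design df values
display "df = 815413: " %6.4f invttail(815413, .025)
display "df = 99:     " %6.4f invttail(99, .025)

The smaller design df gives a larger critical value, which is consistent with the slightly wider bounds seen after -keep if-.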
Doug Hess

#3

21 Mar 2023, 14:52

Originally posted by Leonardo Guizzetti

In answer to your second question, confidence intervals aren’t required to be symmetric. -svy tabulate- computes the CIs for proportions on the logit scale, where they are symmetric around the estimate. The endpoints are then transformed back to the more familiar probability scale, which makes them asymmetric on that scale.

Thanks. I didn't realize -svy tab- used logit for proportions.
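
On the export question from the first post, one possible route (a sketch only, untested against the poster's data) is to estimate the voting rate with -svy: mean-, which for a 0/1 variable such as voted returns the proportion along with a symmetric, t-based CI, and then write the stored results to Excel with -putexcel-. The file name voting_rates.xlsx, the sheet name WI_students, and the matrix name T are placeholders chosen for illustration. Variable names are those used earlier in the thread.

Code:

* declare the design as in the first post
svyset _n [pweight=weight]

* mean of the 0/1 variable = proportion voting, by election cycle
svy, subpop(if stateabb=="WI" & ptfths==1): mean voted, over(pres)

* r(table) stores estimates in columns; transpose so each cycle is a row
matrix T = r(table)'

* keep the point estimate and the lower and upper CI bounds
matrix T = T[1..., "b"], T[1..., "ll"], T[1..., "ul"]

* write the matrix, with row and column names, to one sheet of a workbook
* (file and sheet names are hypothetical; use -modify- instead of -replace-
*  when adding further sheets to the same workbook)
putexcel set "voting_rates.xlsx", sheet("WI_students") replace
putexcel A1 = matrix(T), names

Looping over subpopulations and changing the sheet name on each pass would give one sheet per group, which is close to what was wanted from -xtable-.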