Trouble getting confidence interval/margin of error for "counts"

Anne Gadwa Nicodemus

Join Date: Sep 2015

Posts: 4
#1

10 Mar 2016, 13:06

I am using the 3-year 2011-2013 American Community Survey Public Use Microdata Sample (PUMS) data from IPUMS. We want to use Stata to calculate confidence intervals (and eventually margins of error) on counts within a specific variable (for instance, the number of men and the number of women within the Sex variable). We figured out how to get confidence intervals on the averages of the Sex variable using the ci command, but are stumped on how to get them for counts. Ideally, Stata would seamlessly crank out confidence intervals for each category (i.e., Male/Female) within a variable (i.e., Sex), but we know we can also get at that using "keep" or "if".

We did find a nice spreadsheet from the Census Bureau that tells us the margin of error for different variables for that dataset, so we have something to “check” against. Unfortunately, it doesn’t tell us the margins of error for what we eventually want to get to (detailed occupational category totals and wage/salary info for them).

Thanks,
Anne
Tags: confidence interval, margin of error, PUMS
Clyde Schechter

Join Date: Apr 2014

Posts: 30111
#2

10 Mar 2016, 14:15

Have a look at the -total- command. -total-ing a 0/1 variable gives you counts. Also, since it seems you are using survey data, it is helpful that -total- works nicely with the -svy:- prefix.
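
As a rough sketch of this suggestion, assuming Stata 12 or later (needed for SDR variance estimation) and the example ACS extract from the Stata manuals. The variable sex and its coding (1 = male) are assumptions to verify with -codebook sex-, and the indicator name male is made up for illustration:

```stata
* Example ACS PUMS extract from the Stata manuals (needs internet access)
use http://www.stata-press.com/data/r14/ss07ptx, clear

* Declare the design: person weight plus the 80 ACS replicate weights
svyset [pw = pwgtp], sdrweight(pwgtp1-pwgtp80) vce(sdr)

* 0/1 indicator; ASSUMES sex == 1 means male in this extract
generate byte male = (sex == 1) if !missing(sex)

* Weighted count of men, with a design-based standard error and CI
svy: total male

* Weighted counts for every category of sex in a single table
svy: tabulate sex, count se ci format(%12.0fc)
```

The -svy: tabulate- line is one way to get all categories at once instead of building an indicator per category or subsetting with "keep" or "if".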
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#3

10 Mar 2016, 15:05

ci will give incorrect confidence intervals for the ACS, because it ignores the survey design.

Steve Samuels
Statistical Consulting

Stata 14.2
Anne Gadwa Nicodemus

Join Date: Sep 2015

Posts: 4
#4

10 Mar 2016, 15:29

Thanks, Steve! I've just been poring over the technical documentation. It seems that there are highly tailored formulas that I doubt I could get Stata to automate. Looks like I will be hand-cranking these stats. https://www.census.gov/programs-surv...tion.2013.html
Steve Samuels

Join Date: Mar 2014

Posts: 1786
#5

10 Mar 2016, 15:49

There's no need to do anything by hand. There is an example of the svyset statement for the ACS in the Manual entry for svy sdr. However, the names of the base and replicate weights in your data are probably "perwt" and "repwtp". To be sure, you'll just have to check the variable names in your data.
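
A minimal sketch of what that could look like with IPUMS names, assuming Stata 12 or later, an IPUMS USA extract already in memory, and replicate weights named repwtp1 through repwtp80 (check with -describe-). The mse option and the indicator name male are additions for illustration, not something taken from the Manual example:

```stata
* ASSUMES person weight perwt and replicate weights repwtp1-repwtp80.
* List the variables by name (not repwtp*), because IPUMS extracts may also
* carry a variable called repwtp that is a flag, not a weight.
* mse centers the replicates on the full-sample estimate.
svyset [pw = perwt], sdrweight(repwtp1-repwtp80) vce(sdr) mse

* Hypothetical 0/1 indicator; IPUMS codes sex as 1 = male, 2 = female
generate byte male = (sex == 1) if !missing(sex)

* Weighted count of men with SDR standard error and 95% CI
svy: total male

* Census-style margin of error: 1.645 x SE, i.e. a 90% confidence level
display "MOE (90%) = " %12.0fc 1.645*_se[male]
```

The Census Bureau publishes ACS margins of error at the 90% level, so the 1.645 multiplier is the one to use when checking against its spreadsheet.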

See:
https://usa.ipums.org/usa-action/var.../group?id=tech
and
http://answers.popdata.org/What-diff...a-q834466.aspx

Last edited by Steve Samuels; 10 Mar 2016, 15:52.

Steve Samuels
Statistical Consulting

Stata 14.2

Steve Samuels

Join Date: Mar 2014
Posts: 1786

11 Mar 2016, 11:04

In a previous post, you stated that you were using Stata 11, whereas SDR (successive difference replication) capability first appeared in Stata 12. Stata 11 can, however, analyze balanced repeated replication (BRR) weights. If you run a BRR analysis on SDR replicates and compare it to the correct SDR results, you will find that the BRR standard errors are too small by a factor of 2. You can therefore get correct results by multiplying the BRR standard errors by 2. The code below uses a small program to automate this conversion. (The program multiplies the variance matrix by 4, which doubles the standard errors.)
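
The factor of 2 can be read off the two variance estimators, sketched here for R replicates with replicate estimates \hat\theta_{(r)} and their mean \bar\theta (the notation is chosen for illustration, and BRR is taken without Fay's adjustment):

```latex
\hat V_{\mathrm{BRR}} = \frac{1}{R}\sum_{r=1}^{R}\bigl(\hat\theta_{(r)}-\bar\theta\bigr)^{2},
\qquad
\hat V_{\mathrm{SDR}} = \frac{4}{R}\sum_{r=1}^{R}\bigl(\hat\theta_{(r)}-\bar\theta\bigr)^{2}
\quad\Longrightarrow\quad
\hat V_{\mathrm{SDR}} = 4\,\hat V_{\mathrm{BRR}},
\qquad
\mathrm{SE}_{\mathrm{SDR}} = 2\,\mathrm{SE}_{\mathrm{BRR}} .
```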

Code:

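/* Write a conversion program "brr_to_sdr" */
/* Drop only this program, not every program in memory */
capture program drop brr_to_sdr

program define brr_to_sdr, eclass
    /* Copy the point estimates from the last estimation command */
    matrix b = e(b)
    /* SDR variance = 4 x BRR variance, so standard errors double */
    matrix V = 4*e(V)
    /* Repost the estimates with the rescaled variance matrix */
    ereturn post b V
end

use http://www.stata-press.com/data/r14/ss07ptx, clear

/* SDR Analysis (requires Stata 12 or later) */
/* Note: pwgtp* also matches the base weight pwgtp itself, which is why */
/* 81 replications are reported; pwgtp1-pwgtp80 would select only the   */
/* 80 replicate weights */
svyset [pw = pwgtp] , sdrweight(pwgtp*) vce(sdr)
svy: mean agep
estimates store orig_sdr

/* BRR Analysis (available in Stata 11) */
svyset [pw = pwgtp] , brrweight(pwgtp*) vce(brr)
svy: mean agep

/* Run conversion program */
brr_to_sdr

/* Display results & compare to SDR */
ereturn display // BRR conversion
estimates replay orig_sdr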

Here are the results of the last two commands: the standard errors & CIs are identical.

Code:

. ereturn display  // BRR conversion
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        agep |   34.24496   .0343891   995.81   0.000     34.17756    34.31236
------------------------------------------------------------------------------
. estimates replay orig_sdr
----------------------------------------------------------------------------------------------------------------------------
Model orig_sdr
----------------------------------------------------------------------------------------------------------------------------
Survey: Mean estimation          Number of obs   =     230,817
                                 Population size =  23,904,380
                                 Replications    =          81

--------------------------------------------------------------
             |                 SDR
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        agep |   34.24496   .0343891      34.17756    34.31236
--------------------------------------------------------------

Last edited by Steve Samuels; 11 Mar 2016, 11:39.

Steve Samuels
Statistical Consulting

Stata 14.2
