  • Trouble getting confidence interval/margin of error for "counts"

    I am using the 3-year 2011-2013 American Community Survey Public Use Microdata from IPUMS. We want to use Stata to calculate confidence intervals (and eventually margins of error) on counts within a specific variable (for instance, the number of men and the number of women within the Sex variable). We figured out how to get confidence intervals on the averages of the Sex variable using the ci command, but are stumped on how to get them for counts. Ideally, we could get Stata to seamlessly crank out confidence intervals for the data "observations" (i.e. Male/Female) within a variable (i.e. Sex), but we know we can also get at that using "keep" or "if".

    We did find a nice spreadsheet from the Census Bureau that tells us the margin of error for different variables for that dataset, so we have something to “check” against. Unfortunately, it doesn’t tell us the margins of error for what we eventually want to get to (detailed occupational category totals and wage/salary info for them).

    Thanks,
    Anne

  • Steve Samuels
    replied
    In a previous post, you stated that you were using Stata 11, whereas SDR (successive difference replication) capability first showed up in Stata 12. Stata 11 can, however, analyze balanced repeated replication (BRR) weights. If you run a BRR analysis on SDR replicate weights and compare it to the correct SDR results, you will find that the BRR standard errors are too small by a factor of 2. You can therefore get correct results by multiplying the BRR standard errors by 2. The code below uses a small program to automate this conversion. (The program multiplies the variances by 4, which is equivalent.)

    Code:
    /* Write a conversion program "brr_to_sdr" */
    capture program drop brr_to_sdr
    
    program define brr_to_sdr, eclass
        matrix b = e(b)
        matrix V = 4*e(V)
        ereturn post b V
    end
    
    use http://www.stata-press.com/data/r14/ss07ptx
    
    /*SDR Analysis */
    svyset [pw = pwgtp] , sdrweight(pwgtp*) vce(sdr)
    svy: mean agep
    estimates store orig_sdr
    
    /* BRR Analysis */
    svyset [pw = pwgtp] , brrweight(pwgtp*) vce(brr)
    svy:  mean agep
    
    /*Run  conversion program */
    brr_to_sdr
    
    /* Display Results  & compare to SDR*/
    ereturn display // BRR conversion
    estimates replay orig_sdr
    Here are the results of the last two commands: the standard errors & CIs are identical.

    Code:
    . ereturn display  // BRR conversion
    ------------------------------------------------------------------------------
                 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            agep |   34.24496   .0343891   995.81   0.000     34.17756    34.31236
    ------------------------------------------------------------------------------
    . estimates replay orig_sdr
    ----------------------------------------------------------------------------------------------------------------------------
    Model orig_sdr
    ----------------------------------------------------------------------------------------------------------------------------
    Survey: Mean estimation          Number of obs   =     230,817
                                     Population size =  23,904,380
                                     Replications    =          81
    
    --------------------------------------------------------------
                 |                 SDR
                 |       Mean   Std. Err.     [95% Conf. Interval]
    -------------+------------------------------------------------
            agep |   34.24496   .0343891      34.17756    34.31236
    --------------------------------------------------------------
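
    The factor of 4 used in brr_to_sdr can be read off the two replication variance formulas. As a sketch, write R for the number of replicates, \hat\theta_r for the estimate from replicate r, and \tilde\theta for the centering value (the same in both analyses); this notation is introduced here for illustration and does not appear in the posts above.

```latex
\begin{align*}
\widehat{V}_{\mathrm{BRR}} &= \frac{1}{R}\sum_{r=1}^{R}\bigl(\hat\theta_r - \tilde\theta\bigr)^2,
&
\widehat{V}_{\mathrm{SDR}} &= \frac{4}{R}\sum_{r=1}^{R}\bigl(\hat\theta_r - \tilde\theta\bigr)^2
\\[4pt]
\widehat{V}_{\mathrm{SDR}} &= 4\,\widehat{V}_{\mathrm{BRR}},
&
\widehat{\mathrm{SE}}_{\mathrm{SDR}} &= 2\,\widehat{\mathrm{SE}}_{\mathrm{BRR}}
\end{align*}
```

    Because the same replicate weights and the same sum of squared deviations enter both formulas, only the leading constant differs, which is why scaling e(V) by 4 reproduces the SDR standard errors.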
    Last edited by Steve Samuels; 11 Mar 2016, 12:39.


  • Steve Samuels
    replied
    There's no need to do anything by hand. There is an example of the svyset statement for the ACS in the Manual entry for svy sdr. However, the names of the base and replicate weights in your data are probably "perwt" and "repwtp". To be sure, you'll just have to check the variable names in your data.

    See:
    https://usa.ipums.org/usa-action/var.../group?id=tech
    and
    http://answers.popdata.org/What-diff...a-q834466.aspx
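
    As a minimal sketch of what that svyset statement might look like for an IPUMS extract: the names perwt and repwtp1-repwtp80 are assumptions based on the guess above and should be confirmed against the data first. vce(sdr) requires Stata 12 or later.

```stata
* Sketch only: run with the IPUMS extract already loaded in memory.

* Search variable names and labels for the weights to confirm the
* assumed names before using them
lookfor perwt repwt

* Declare the design: perwt as the full-sample person weight and
* repwtp1-repwtp80 as the SDR replicate weights (assumed names)
svyset [pw = perwt], sdrweight(repwtp1-repwtp80) vce(sdr)

* Typing svyset with no arguments redisplays the stored settings
svyset
```

    Listing the replicate weights as an explicit range, rather than with a wildcard, avoids accidentally picking up any other variable that shares the prefix.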
    Last edited by Steve Samuels; 10 Mar 2016, 16:52.


  • Anne Gadwa Nicodemus
    replied
    Thanks, Steve! I've just been poring over the technical documentation. It seems that there are highly tailored formulas that I doubt I could get Stata to automate. Looks like I will be hand-cranking these stats. https://www.census.gov/programs-surv...tion.2013.html


  • Steve Samuels
    replied
    ci will give incorrect confidence intervals for the ACS, because it ignores the survey design.


  • Clyde Schechter
    replied
    Have a look at the -total- command. -total-ing a 0/1 variable gives you counts. Also, since it seems you are using survey data, it is helpful that -total- works nicely with the -svy:- prefix.
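
    A short sketch of this suggestion, using the ACS example dataset that appears elsewhere in this thread. It assumes that dataset has a variable sex coded 1 = male and 2 = female (check with -tabulate sex-), and the indicator names male and female are made up for illustration. The 90% level and the 1.645 multiplier follow the Census Bureau's convention for published margins of error.

```stata
* Sketch: design-based counts with SDR standard errors (Stata 12 or later)
use http://www.stata-press.com/data/r14/ss07ptx, clear

* Declare the design, listing the 80 replicate weights explicitly
svyset [pw = pwgtp], sdrweight(pwgtp1-pwgtp80) vce(sdr)

* 0/1 indicators; assumes sex is coded 1 = male, 2 = female
generate byte male   = (sex == 1) if !missing(sex)
generate byte female = (sex == 2) if !missing(sex)

* Totals of 0/1 variables are weighted counts; request 90% intervals
svy: total male female, level(90)

* Census-style margin of error: 1.645 times the standard error
display "MOE (female, 90%): " %12.0fc (1.645*_se[female])

* Alternative without indicators: weighted counts for every category
svy: tabulate sex, count se ci level(90) format(%12.0fc)
```

    The -svy: tabulate- form is convenient when a variable has many categories, such as detailed occupation codes, since it avoids creating one indicator per category.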
