  • Graphing svy tabulation results including 95% confidence intervals

    Hello all,

    I frequently find that I want to produce line graphs of time series of categorical variables using survey data.

    For instance, to just use a simple example, let's say I want to use the American Community Survey (a large, nationally-representative household survey of the US population) to produce a line graph of the annual rate of uninsurance among 4 different race/ethnic groups. Here, race4cat is a 4-level categorical variable of race. Uninsured = 1 if uninsured and 0 if insured. I have already svyset the data using pweights, cluster, and strata variables. Year is 2009-2018.

    Code:
    svy, subpop(if race4cat == 1): tabulate uninsured year, column percent
    svy, subpop(if race4cat == 2): tabulate uninsured year, column percent
    svy, subpop(if race4cat == 3): tabulate uninsured year, column percent
    svy, subpop(if race4cat == 4): tabulate uninsured year, column percent
    If I want to generate a line graph with percentage on the y-axis and year on the x-axis, there seems to be no easy way to "use" the results from the svy: tabulate command. The only way I have managed to do this is by collapsing the data to mean values by race and year, which I believe works because the weighted mean of a binary (0/1) variable is the weighted proportion, i.e. the percentage divided by 100. (I would use 'sum' instead of 'mean' if I wanted the weighted count of uninsured individuals instead):

    Code:
    collapse (mean) uninsuredavg=uninsured [pweight=perwt], by(year race4cat)
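
    As a quick check of the claim that the weighted mean of a 0/1 variable is the weighted proportion, here is a minimal sketch on made-up data. The four observations and their weights are hypothetical; only the variable names uninsured and perwt come from the code above.

    ```stata
    * Toy data (hypothetical values): two uninsured, two insured people
    clear
    input uninsured perwt
    1 10
    0 30
    1 20
    0 40
    end

    * Weighted mean by hand: (10 + 20) / (10 + 30 + 20 + 40) = 0.30,
    * i.e. 30% of the weighted population is uninsured
    collapse (mean) uninsuredavg=uninsured [pweight=perwt]
    list
    ```

    Multiplying uninsuredavg by 100 would put it on the same scale as the column percentages reported by svy: tabulate.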
    I then produce the graph with:

    Code:
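    * One connected line per race/ethnic group; plots are separated by
    * spaces only, since a comma would start the twoway options
    twoway (connected uninsuredavg year if race4cat == 1) ///
           (connected uninsuredavg year if race4cat == 2) ///
           (connected uninsuredavg year if race4cat == 3) ///
           (connected uninsuredavg year if race4cat == 4)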
    However, while this graphs the means, it does not graph 95% confidence intervals. The collapse command can generate standard errors, but not with pweights, which are essential for these data.

    My two questions are:
    1. Is there a less roundabout way to graph survey tabulation results than collapsing the dataset as I have done?
    2. Assuming no, is there a way to graph confidence intervals?
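
    One possible direction for question 2, offered only as an untested sketch: loop over race and year, estimate each mean with svy: mean, and save the point estimate and confidence limits to a small results file that can then be graphed. It assumes the data are already svyset and that the variables are named uninsured, race4cat (1-4) and year (2009-2018) as described above. The names pct, lb, ub and the matrix T are made up for illustration. The row positions in r(table) (1 = estimate, 5 and 6 = confidence limits) are the standard layout as far as I know, but are worth checking with matrix list. Note also that the intervals from svy: mean are not necessarily the same as those svy: tabulate would report with its ci option, so treat the result as approximate.

    ```stata
    * Sketch only: collect survey-weighted percentages and 95% CIs by race and year
    tempname results
    tempfile ci
    postfile `results' race4cat year pct lb ub using `ci'

    forvalues r = 1/4 {
        forvalues y = 2009/2018 {
            * subpop() keeps the full design in the variance calculation
            quietly svy, subpop(if race4cat == `r' & year == `y'): mean uninsured
            matrix T = r(table)
            * Row 1 = estimate, row 5 = lower limit, row 6 = upper limit
            post `results' (`r') (`y') (100*T[1,1]) (100*T[5,1]) (100*T[6,1])
        }
    }
    postclose `results'

    * Graph from the saved results without losing the survey data in memory
    preserve
    use `ci', clear
    local plots
    forvalues r = 1/4 {
        local plots `plots' (rcap lb ub year if race4cat == `r') ///
            (connected pct year if race4cat == `r')
    }
    twoway `plots', ytitle("Percent uninsured") xtitle("Year")
    restore
    ```

    Running 40 separate estimations on a file the size of the ACS may be slow, and the default legend will need tidying, but this avoids collapsing the original data.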

    Many thanks in advance!

    Adam