
  • Dropping observations with survey-weighted data

    I would like to run an analysis of all youth included in the 2010-2014 Healthcare Cost and Utilization Project (HCUP) Nationwide Emergency Department Sample data. These datasets are too large for my computer to run even simple regressions, even after dropping every variable I do not need. Because I am only interested in youth, I would like to drop the adult observations while still using the survey weights. I am unable to run the analyses with the subpop option on the full data because of how much memory that requires. I understand that the standard errors will be incorrect if I simply drop observations from survey data. HCUP provides some guidance for subsetting data: "The alternate method for calculating appropriate standard errors is to subset the nationwide database to the observations of interest. Then, append one "dummy" observation for each of the hospitals included in the nationwide database that is not represented in the subset. The dummy observations ensure that all the hospitals in the sample are taken into account, resulting in the accurate calculation of standard error." However, their example uses SAS code. Is anyone familiar with using a subset of survey data in Stata, without running subpop on the full data, while still getting correct SEs, or able to point me in the right direction?
    Please let me know if any of this is unclear; I apologize in advance.
    Thank you!

  • #2
    You don't need to keep all the adults. See this post by Austin Nichols, which was a response to a similar question by Richard Williams 11 years ago. The data you need is described in the concluding paragraph as follows:
    Note that the "better" data just contain one obs for each stratum/psu containing the sum of weights for excluded obs, thus reducing the total size of the data.
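
    A minimal sketch of that construction in Stata, offered as an illustration rather than as the code from the linked post. The variable names (age, hosp_ed, neds_stratum, discwt), the file name neds_2010, the age cutoff of 18, and the outcome and x1 variables are all assumptions; check them against your own files and the HCUP design documentation before relying on the results.

    ```stata
    * Sketch only: every variable and file name below is an assumption.
    * Load just the variables the analysis needs, to limit memory use.
    use age hosp_ed neds_stratum discwt outcome x1 using neds_2010, clear

    * Flag the subpopulation. Missing ages evaluate to 0 here because Stata
    * treats missing as larger than any number; handle them explicitly if needed.
    generate byte youth = age < 18

    * Collapse the excluded observations to one placeholder row per
    * stratum/PSU that carries the sum of their weights.
    preserve
    keep if youth == 0
    collapse (sum) discwt, by(neds_stratum hosp_ed)
    generate byte youth = 0
    tempfile placeholders
    save `placeholders'
    restore

    * Keep the youth records and append the placeholder rows, so every
    * stratum and PSU in the design is still represented.
    keep if youth == 1
    append using `placeholders'

    * Declare the design and estimate on the subpopulation. The placeholder
    * rows have missing outcome and x1, which is fine because they are
    * outside the subpopulation.
    svyset hosp_ed [pweight = discwt], strata(neds_stratum)
    svy, subpop(youth): regress outcome x1
    ```

    The reduced data still go through subpop(), but they now hold only the youth records plus one row per stratum/PSU, so memory use should be far lower than with the full file.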
    Steve Samuels
    Statistical Consulting

    Stata 14.2


    • #3
      It was asked exactly 11 years ago! I will try this, thank you.