I would like to run analyses on all youth in the 2010-2014 Healthcare Cost and Utilization Project (HCUP) Nationwide Emergency Department Sample (NEDS). These datasets are too large for my computer to run even simple regressions, despite dropping every variable I do not need. Because I am only interested in youth, I would like to drop the adult observations while still using the survey weights. I cannot run the analyses with the subpop option because of how much memory it requires, and I understand that the standard errors will be incorrect if I simply drop observations from survey data.

HCUP provides some guidance for subsetting data: "The alternate method for calculating appropriate standard errors is to subset the nationwide database to the observations of interest. Then, append one "dummy" observation for each of the hospitals included in the nationwide database that is not represented in the subset. The dummy observations ensure that all the hospitals in the sample are taken into account, resulting in the accurate calculation of standard error." However, their example uses SAS code.

Is anyone familiar with using a subset of survey data in Stata while still getting correct SEs without using subpop on the full file, or able to point me in the right direction?
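To make the question concrete, here is my rough reading of how the HCUP guidance quoted above might translate to Stata. This is only a sketch that I have not verified against full-sample results. The file name, `outcome`, and `covariate` are hypothetical placeholders, and I am assuming the NEDS design variables are named `hosp_ed`, `neds_stratum`, and `discwt` (lowercased). It still calls subpop(), but only on the reduced file, so memory should be much less of an issue.

```stata
* Hypothetical file name for one year of data
local full "neds_2010_core.dta"

* One row per hospital in the full sample, loading design variables only
use hosp_ed neds_stratum using "`full'", clear
duplicates drop hosp_ed, force
tempfile hospitals
save `hospitals'

* Load only the youth records and only the variables needed
use hosp_ed neds_stratum discwt age outcome covariate ///
    if age < 18 using "`full'", clear
generate byte insubset = 1

* Add one dummy row for each hospital that has no youth records
merge m:1 hosp_ed using `hospitals'
replace insubset = 0 if _merge == 2
replace discwt = 1 if insubset == 0

* Fill analysis variables on dummy rows in case missing values
* would otherwise drop them from the estimation sample
foreach v of varlist age outcome covariate {
    replace `v' = 0 if insubset == 0
}
drop _merge

* Declare the design, then estimate within the youth rows only
svyset hosp_ed [pweight = discwt], strata(neds_stratum)
svy, subpop(insubset): regress outcome covariate
```

The idea is that the dummy rows keep every hospital and stratum in the design while contributing nothing to the youth estimates, but I do not know whether this fully reproduces what the SAS approach does.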
Please let me know if any of this is unclear; I apologize in advance.
Thank you!