Dear Statalist members,
I have tried to modify the svyset command so that it accounts for the repeated measures within SPID, but I have been unsuccessful. If anyone knows how to incorporate them, I would be extremely grateful. From fitting the model on the subsample alone, without the survey weights, I know roughly what my results should look like.
I am using the National Health and Aging Trends Study (NHATS) to evaluate the number of hospitalizations per person per year in a subpopulation of persons with dementia and multiple chronic conditions, who are followed over 3 waves (years) of data (variable: round). Follow-up is one interview per year. My goal is to fit a Poisson model, either a multilevel/hierarchical model or a GEE.
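For reference, the two candidate models can be written out as below. This is a generic sketch and is not taken from the NHATS documentation: the symbols $y_{it}$ (hospitalizations for person $i$ in round $t$), $x_{it}$ (covariates), $u_i$ (person-level random intercept) and $\rho$ (working correlation) are notation assumed here for illustration.

```latex
% Multilevel (random-intercept) Poisson model, subject-specific
y_{it} \mid u_i \sim \mathrm{Poisson}(\mu_{it}), \qquad
\log \mu_{it} = x_{it}^{\top}\beta + u_i, \qquad
u_i \sim N(0, \sigma_u^2)

% GEE (population-averaged) Poisson model, exchangeable working correlation
\log \mathrm{E}[y_{it}] = x_{it}^{\top}\beta, \qquad
\mathrm{Corr}(y_{is}, y_{it}) = \rho \quad (s \neq t)
```

In both formulations $\exp(\beta)$ is reported as an incidence-rate ratio. The first models the dependence between rounds through $u_i$, the second through the working correlation.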
According to NHATS Technical Papers 1 and 2, Round 1 (2011) of NHATS used a stratified three-stage sampling design:
- First stage: 95 PSUs, which are counties or groups of counties, selected with probability proportional to size.
- Second stage: 655 SSUs within the sampled PSUs. These are ZIP clusters drawn from a sampling frame constructed from a 20% subsample of persons enrolled in Medicare as of Sept 2010 who resided in the 95 PSUs sampled for NHATS. ZIP clusters were also selected with probability proportional to size.
- Final stage: selection of beneficiaries age 65 and older within each sampled ZIP cluster (SSU), with oversamples of the oldest age groups and of Black non-Hispanic persons. The probabilities of selection at each of the three stages were designed to yield equal probability samples and targeted sample sizes by age group and race/ethnicity.
The number of people (SPID) included is 8,245. My subpopulation is 909.
svyset w1varunit [pweight=w1anfinwgt0], strat(w1varstrat)
stratum: w1varstrat and cluster: w1varunit are the variables that allow users to compute variance estimates using Taylor series linearization.
w1anfinwgt0 = sampling analytic weight
However, when I run my model with this setup, I see that the waves are not taken into account:
svyset w1varunit [pweight=w1anfinwgt0], strat(w1varstrat)
svy linearized, subpop(variable): mepoisson dependent_variable ib1.independent_variable covariates, irr
Output: with survey weights and strata
Survey: Poisson regression
(running mepoisson on estimation sample)

Number of strata =  56        Number of obs   =          23,140
Number of PSUs   = 112        Population size =  105,193,875.00
                              Subpop. no. obs =           1,927
                              Subpop. size    =    6,939,508.90
                              Design df       =              56
                              F( 22, 35)      =           14.57
                              Prob > F        =
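The degrees of freedom in this header are consistent with the usual rules for linearized survey variance estimation: the design df is the number of PSUs minus the number of strata, and the adjusted Wald test has denominator df equal to the design df minus the number of tested terms plus one. As a quick check using only the numbers reported above:

```latex
\text{design df} = n_{\mathrm{PSU}} - n_{\mathrm{strata}} = 112 - 56 = 56, \qquad
\text{denominator df} = 56 - 22 + 1 = 35
```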
Output on subpopulation unweighted:
GEE population-averaged model        Number of obs     =    1,927
Group variable:  spid                Number of groups  =      909
Link:            log                 Obs per group:
Family:          Poisson                           min =        1
Correlation:     exchangeable                      avg =      2.1
                                                   max =        3
                                     Wald chi2(22)     =   331.34
Scale parameter: 1.3196              Prob > chi2       =
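The command that produced this output is not shown in the post. As a rough sketch only, unweighted panel fits of this kind are typically requested along the following lines in Stata. The names hosp_count, exposure_cat, covar1, covar2 and in_subpop are hypothetical placeholders, not NHATS variables; spid and round are the identifiers used in this post.

```stata
* Declare the panel structure: persons (spid) observed across rounds
xtset spid round

* Population-averaged Poisson GEE with an exchangeable working correlation;
* eform reports exponentiated coefficients (incidence-rate ratios)
xtgee hosp_count ib1.exposure_cat covar1 covar2 if in_subpop == 1, ///
    family(poisson) link(log) corr(exchangeable) eform

* Random-intercept Poisson model: the "|| spid:" equation is what makes
* the repeated measures within a person a level in the model
mepoisson hosp_count ib1.exposure_cat covar1 covar2 if in_subpop == 1 ///
    || spid:, irr
```

Neither call uses the survey weights or design variables, so they serve only as a point of comparison for the svy results, not as a substitute for them.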
Here is other code I have tried that did not work:
svyset w1varstrat [pw = w1anfinwgt0], strata(w1varunit)|| round, strata(spid),
Note: Stage 1 is sampled with replacement; further stages will be ignored for variance estimation.
pweight: w1anfinwgt0
VCE: linearized
Single unit: missing
Strata 1: w1varunit
SU 1: w1varstrat
FPC 1: <zero>
. svydes
Survey: Describing stage 1 sampling units
pweight: w1anfinwgt0
VCE: linearized
Single unit: missing
Strata 1: w1varunit
SU 1: w1varstrat
FPC 1: <zero>
                                      #Obs per Unit
                              ----------------------------
Stratum    #Units     #Obs       min      mean       max
--------  --------  --------  --------  --------  --------
       1        56    12,633        42     225.6       381
       2        56    12,102        39     216.1       336
--------  --------  --------  --------  --------  --------
       2       112    24,735        39     220.8       381