Hello,
I am new to Stata.
I am currently analyzing a data set of influenza cases hospitalized in intensive care units in an administrative region. Among the variables studied, I have the virus type (A or B) and the virus subtype (A(H1N1), A(H3N2), or B). The percentage of missing data is relatively small for the type (3.2%), but it is 32% for the subtype.
Since we know the distribution of subtypes among outpatients in the country for each season, is it valid to impute the missing type and subtype values using the proportion of virus A (“pr_A”) and the proportion of subtype A(H1N1) (“pr_h1n1”) found in the general population?
Below are the Stata commands I use to implement this procedure.
There are two variables, “typ” and “styp”:
- “typ” values: 0 for B virus, 1 for A virus
- “styp” values: 0 for A(H3N2), 1 for A(H1N1), 2 for B
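* Flag observations with a missing type or subtype
gen Rtyp = missing(typ)
gen Rstyp = missing(styp)
* Bring in the seasonal outpatient proportions
merge m:1 i_season using "$chemin/outpatient.dta", keepusing(pr_A pr_h1n1)
* Draw the missing type: 1 (A) with probability pr_A, otherwise 0 (B)
replace typ = runiform() <= pr_A if Rtyp == 1
* A type B virus always gets subtype code 2
replace styp = 2 if Rstyp == 1 & typ == 0
* Draw the missing A subtype: 1 (H1N1) with probability pr_h1n1, otherwise 0 (H3N2)
replace styp = runiform() <= pr_h1n1 if Rstyp == 1 & typ == 1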
Or would it be better to use multiple imputation with ICE?
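For comparison, here is a minimal sketch of what the multiple-imputation alternative might look like. It assumes Stata's built-in mi commands rather than the user-written ice command, that typ and styp still hold their original missing values (so it would replace the single random draws, not follow them), and that i_season is a complete numeric variable. The number of imputations, the seed, and the new variable name typ_mi are arbitrary choices for illustration. With i_season as the only predictor the model is deliberately bare, and it does not force the imputed subtype to agree with an observed typ.

```stata
* Sketch only: run on the data before any missing values are filled in
mi set mlong
mi register imputed styp

* Impute the three-category subtype (0 = A(H3N2), 1 = A(H1N1), 2 = B)
* with a multinomial logit model; add further predictors as available
mi impute mlogit styp i.i_season, add(20) rseed(12345)

* Derive a type variable from the imputed subtype so the two stay consistent
* (typ_mi is a hypothetical name: 1 = A, 0 = B)
mi passive: generate typ_mi = (styp != 2) if !missing(styp)
```

Analyses would then be run with the mi estimate prefix so that the uncertainty from the imputation is carried into the standard errors, which the single-draw approach above does not do.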
Thanks for your help.
Ronan.