Hi,
The example given and attached are simulated datasets. The original data is longitudinal (approximately 100k observations) spanning across multiple years, where an individual (age>=35) with COPD diagnosis remains in the study sample until they leave the region or die. I am trying to estimate population at risk, i.e. age>=35 and without COPD diagnosis. I can do it in the excel by estimating prevalence and population separately. I would have done but I need to do multiple crosstabs e.g., age, gender, age-gender, age-gender-health region. Thus, I tried estimate the totals (prevalence, population at risk) and then export the numbers. In the codes below:
I managed to estimate population at risk for each year, but I still couldn't figure it out for age and other cross-tabulations. I tried loops, referring years and age categories similar to overall "population risk". But either I don't refer to the right combination or the result is dot due to missing values.
I may have to calculate it in excel, but thought I would ask before doing as the task is quite time consuming. The simulate data has single observation per person unlike the original data.
Thanks in advance.
The example given and attached are simulated datasets. The original data is longitudinal (approximately 100k observations) spanning across multiple years, where an individual (age>=35) with COPD diagnosis remains in the study sample until they leave the region or die. I am trying to estimate population at risk, i.e. age>=35 and without COPD diagnosis. I can do it in the excel by estimating prevalence and population separately. I would have done but I need to do multiple crosstabs e.g., age, gender, age-gender, age-gender-health region. Thus, I tried estimate the totals (prevalence, population at risk) and then export the numbers. In the codes below:
Code:
* Example generated by -dataex-. To install: ssc install dataex clear input long id str1 gender byte healthregion float(copd year age2012 age2013 age2014 category_age2012 category_age2013 category_age2014) 1 "F" 3 . 2012 80.25 81.25 82.25 4 4 4 2 "" . . . 86.5 87.5 88.5 4 4 4 3 "M" 1 1 2013 20 21 22 1 1 1 4 "" . . . 50.75 51.75 52.75 3 3 3 5 "" . . . 64.333336 65.333336 66.333336 4 4 4 6 "" . . . 106.91666 107.91666 108.91666 4 4 4 7 "" . . . 63.25 64.25 65.25 3 4 4 8 "" . . . 28 29 30 1 1 1 9 "M" 2 . 2012 48.08333 49.08333 50.08333 2 3 3 10 "" . . . 63.25 64.25 65.25 3 4 4 11 "" . . . 45.5 46.5 47.5 2 2 2 12 "" . . . 59.41667 60.41667 61.41667 3 3 3 13 "M" 2 1 2012 77.66666 78.66666 79.66666 4 4 4 14 "" . . . 35.083332 36.083332 37.083332 2 2 2 15 "M" 1 1 2012 7.083333 8.083333 9.083333 1 1 1 16 "" . . . 5 6 7 1 1 1 17 "" . . . 43.91667 44.91667 45.91667 2 2 2 18 "" . . . 15.166667 16.166666 17.166666 1 1 1 19 "" . . . 47.33333 48.33333 49.33333 2 2 3 20 "" . . . 55.25 56.25 57.25 3 3 3 21 "" . . . 62.25 63.25 64.25 3 3 4 22 "" . . . 51.58333 52.58333 53.58333 3 3 3 23 "" . . . 28.5 29.5 30.5 1 1 1 24 "F" 1 1 2012 107.16666 108.16666 109.16666 4 4 4 25 "M" 2 1 2012 26.583334 27.583334 28.583334 1 1 1 26 "F" 1 . 2012 80.58334 81.58334 82.58334 4 4 4 27 "" . . . 13.666667 14.666667 15.666667 1 1 1 28 "" . . . 55.58333 56.58333 57.58333 3 3 3 29 "" . . . 3.083333 4.0833335 5.083333 1 1 1 30 "" . . . 8.583333 9.583333 10.583333 1 1 1 31 "" . . . 29.916666 30.916666 31.916666 1 1 1 32 "F" 2 1 2012 58.91667 59.91667 60.91667 3 3 3 33 "" . . . 49.83333 50.83333 51.83333 3 3 3 34 "F" 1 1 2012 41.08333 42.08333 43.08333 2 2 2 35 "M" 4 1 2012 41.75 42.75 43.75 2 2 2 36 "" . . . 31.25 32.25 33.25 1 1 1 37 "" . . . 39.66667 40.66667 41.66667 2 2 2 38 "" . . . 27.333334 28.333334 29.333334 1 1 1 39 "" . . . 34.416668 35.416668 36.416668 1 2 2 40 "" . . . 49.83333 50.83333 51.83333 3 3 3 41 "" . . . 70.333336 71.333336 72.333336 4 4 4 42 "" . . . 49.83333 50.83333 51.83333 3 3 3 43 "" . . . 29.416666 30.416666 31.416666 1 1 1 44 "" . . . 26.25 27.25 28.25 1 1 1 45 "" . . . 19.916666 20.916666 21.916666 1 1 1 46 "" . . . 101.5 102.5 103.5 4 4 4 47 "" . . . 62 63 64 3 3 3 48 "M" 1 1 2012 36.916668 37.916668 38.91667 2 2 2 49 "" . . . 18.916666 19.916666 20.916666 1 1 1 50 "" . . . 43.5 44.5 45.5 2 2 2 51 "" . . . 49.16667 50.16667 51.16667 3 3 3 52 "" . . . 48.66667 49.66667 50.66667 2 3 3 53 "" . . . 56.83333 57.83333 58.83333 3 3 3 54 "M" 5 . 2012 50.08333 51.08333 52.08333 3 3 3 55 "F" 4 1 2012 56.91667 57.91667 58.91667 3 3 3 56 "F" 7 . 2012 23.166666 24.166666 25.166666 1 1 1 57 "F" 2 1 2012 4.75 5.75 6.75 1 1 1 58 "" . . . 109.16666 110.16666 111.16666 4 4 4 59 "F" 1 . 2012 86.58334 87.58334 88.58334 4 4 4 60 "" . . . 63.41667 64.416664 65.416664 3 4 4 61 "" . . . 85.25 86.25 87.25 4 4 4 62 "" . . . 21.166666 22.166666 23.166666 1 1 1 63 "" . . . 72.416664 73.416664 74.416664 4 4 4 64 "M" 5 . 2012 36.583332 37.583332 38.58333 2 2 2 65 "" . . . 60.58333 61.58333 62.58333 3 3 3 66 "" . . . 62.66667 63.66667 64.666664 3 3 4 67 "F" 2 1 2012 58.41667 59.41667 60.41667 3 3 3 68 "" . . . 54.66667 55.66667 56.66667 3 3 3 69 "" . . . 17.25 18.25 19.25 1 1 1 70 "" . . . 65 66 67 4 4 4 71 "" . . . 26.666666 27.666666 28.666666 1 1 1 72 "" . . . 35.333332 36.333332 37.333332 2 2 2 73 "" . . . 18.25 19.25 20.25 1 1 1 74 "" . . . 44.16667 45.16667 46.16667 2 2 2 75 "" . . . 62.5 63.5 64.5 3 3 4 76 "" . . . 57.33333 58.33333 59.33333 3 3 3 77 "M" 5 1 2012 55.91667 56.91667 57.91667 3 3 3 78 "M" 2 . 2012 57.08333 58.08333 59.08333 3 3 3 79 "" . . . 35.75 36.75 37.75 2 2 2 80 "M" 6 . 2012 42.58333 43.58333 44.58333 2 2 2 81 "M" 7 1 2013 37.333332 38.33333 39.33333 2 2 2 82 "" . . . 79.58334 80.58334 81.58334 4 4 4 83 "" . . . 26.333334 27.333334 28.333334 1 1 1 84 "" . . . 47.08333 48.08333 49.08333 2 2 3 85 "" . . . 72 73 74 4 4 4 86 "" . . . 43 44 45 2 2 2 87 "" . . . 62.91667 63.91667 64.916664 3 3 4 88 "" . . . 33.833332 34.833332 35.833332 1 1 2 89 "" . . . 37.333332 38.33333 39.33333 2 2 2 90 "F" 5 . 2012 41.33333 42.33333 43.33333 2 2 2 91 "" . . . 54.41667 55.41667 56.41667 3 3 3 92 "M" 3 1 2012 19.75 20.75 21.75 1 1 1 93 "" . . . 55.58333 56.58333 57.58333 3 3 3 94 "" . . . 77.83334 78.83334 79.83334 4 4 4 95 "M" 7 1 2012 94.83334 95.83334 96.83334 4 4 4 96 "" . . . 2.25 3.25 4.25 1 1 1 97 "" . . . 69.916664 70.916664 71.916664 4 4 4 98 "" . . . 57.83333 58.83333 59.83333 3 3 3 99 "" . . . 73.833336 74.833336 75.833336 4 4 4 100 "" . . . 26.75 27.75 28.75 1 1 1 end label values category_age2012 few_cat label values category_age2013 few_cat label values category_age2014 few_cat label def few_cat 1 "Under 35", modify label def few_cat 2 "35-49", modify label def few_cat 3 "50-64", modify label def few_cat 4 "65", modify
I may have to calculate it in excel, but thought I would ask before doing as the task is quite time consuming. The simulate data has single observation per person unlike the original data.
Thanks in advance.
Code:
*Prevalence* ************ clear* use copd_poprisk *Overall* forvalues yr = 2012(1)2014 { bys year: egen prev_`yr' = total(copd) if age`yr'>35 & year == `yr' } forvalues yr = 2012(1)2014 { sort year prev_`yr' tabdisp year if year == `yr', cell (prev_`yr') } *Age* *Prevalence* forvalues yr = 2012(1)2014 { bys category_age`yr' year : egen prev_age_`yr' = total(copd) if age`yr'>35 & year == `yr' } forvalues yr = 2012(1)2014 { sort year category_age`yr' prev_age_`yr' tabdisp category_age`yr' if year ==`yr', cell (prev_age_`yr') } *Population at risk = Pop without COPD of previous year - prevalence of current year ************************************************************************************************* *Population* ************ *Overall* gen pop = 1 forvalues yr = 2012(1)2014 { bys year: egen pop_`yr' = total(pop) if age`yr'>35 & year == `yr' } forvalues yr = 2012(1)2014 { sort year pop_`yr' tabdisp year if year == `yr', cell (pop_`yr') } *Age* forvalues yr = 2012(1)2014 { bys category_age`yr' year : egen pop_age_`yr' = total(pop) if age`yr'>35 & year == `yr' } forvalues yr = 2012(1)2014 { sort year category_age`yr' pop_age_`yr' tabdisp category_age`yr' if year ==`yr', cell (pop_age_`yr') } *Population at Risk* ******************** forvalues yr = 2012(1)2014 { sort prev_`yr' replace prev_`yr' = prev_`yr'[_n-1] if prev_`yr' == . } forvalues yr = 2012(1)2014 { sort pop_`yr' replace pop_`yr' = pop_`yr'[_n-1] if pop_`yr' == . } *Overall* gen pop_risk_2013 = pop_2012 - prev_2013 if year == 2013 gen pop_risk_2014 = pop_2013 - prev_2014 if year == 2014
Comment