I'm quite sure you didn't intend to waste my time. And I apologize if I was too harsh in my reaction. The lesson to learn is that in programming, the details are extremely important. Something that one thinks is "sufficiently similar" to the real data may, as was the case here, be different in ways that break the code.
Anyway, the following should work:
Code:
// CLEAN THE EURO STANDARD POPULATION
// AND MODIFY ITS CONTENT TO SUPPORT
// MATCHING WITH THE OTHER DATA
* Example generated by -dataex-. To install: ssc install dataex
clear
input str8 agegroup str3 sex str7 ESP2013
"agegroup" "sex" "ESP2013"
"0-4" "M" "5000"
"0-4" "F" "5000"
"5-9" "M" "5500"
"5-9" "F" "5500"
"10-14" "M" "5500"
"10-14" "F" "5500"
"15-19" "M" "5500"
"15-19" "F" "5500"
"20-24" "M" "6000"
"20-24" "F" "6000"
"25-29" "M" "6000"
"25-29" "F" "6000"
"30-34" "M" "6500"
"30-34" "F" "6500"
"35-39" "M" "7000"
"35-39" "F" "7000"
"40-44" "M" "7000"
"40-44" "F" "7000"
"45-49" "M" "7000"
"45-49" "F" "7000"
"50-54" "M" "7000"
"50-54" "F" "7000"
"55-59" "M" "6500"
"55-59" "F" "6500"
"60-64" "M" "6000"
"60-64" "F" "6000"
"65-69" "M" "5500"
"65-69" "F" "5500"
"70-74" "M" "5000"
"70-74" "F" "5000"
"75-79" "M" "4000"
"75-79" "F" "4000"
"80-84" "M" "2500"
"80-84" "F" "2500"
"85-89" "M" "1500"
"85-89" "F" "1500"
"90+" "M" "1000"
"90+" "F" "1000"
end

// REMOVE FIRST OBSERVATION WHICH CONTAINS VARIABLE NAMES
drop in 1

// CONVERT ESP2013 TO NUMERIC VARIABLE
destring ESP2013, replace

// COMBINE 85-89 AND 90+ AGE GROUPS
// SO AS TO MATCH WITH GLIOMA DATA
replace agegroup = "85+" if inlist(agegroup, "85-89", "90+")
collapse (sum) ESP2013, by(agegroup sex)

// AND NUMERICALLY ENCODE IT TO MATCH THE USAGE
// IN THE GLIOMA DATA, RENAMING IT TO age_group
label def AgeCat 1 "0-4", modify
label def AgeCat 2 "5-9", modify
label def AgeCat 3 "10-14", modify
label def AgeCat 4 "15-19", modify
label def AgeCat 5 "20-24", modify
label def AgeCat 6 "25-29", modify
label def AgeCat 7 "30-34", modify
label def AgeCat 8 "35-39", modify
label def AgeCat 9 "40-44", modify
label def AgeCat 10 "45-49", modify
label def AgeCat 11 "50-54", modify
label def AgeCat 12 "55-59", modify
label def AgeCat 13 "60-64", modify
label def AgeCat 14 "65-69", modify
label def AgeCat 15 "70-74", modify
label def AgeCat 16 "75-79", modify
label def AgeCat 17 "80-84", modify
label def AgeCat 18 "85+", modify
encode agegroup, gen(age_group) label(AgeCat)
drop agegroup

// NUMERICALLY ENCODE SEX TO MATCH USAGE
// IN GLIOMA DATA
label def sex 1 "F", modify
label def sex 2 "M", modify
rename sex _sex
encode _sex, gen(sex) label(sex)
drop _sex

quietly compress
tempfile euro_standard
save `euro_standard'

// NOW BRING IN THE GLIOMA DATA
* Example generated by -dataex-. To install: ssc install dataex
clear
input int(dg_y age_group) float sex byte allglioma long pop float(Agecat Agecat2 year Cat year5 Cat2 inc_rate)
1971 8 1 2 134865 2 2 1 19712 1 12 1.4829644
1976 4 1 3 190273 1 1 6 19761 2 21 1.576682
1987 13 1 14 137021 4 4 17 19874 4 44 10.217412
1984 13 1 14 138845 4 4 14 19844 3 34 10.083186
2005 5 1 2 163226 2 2 35 20052 8 82 1.225295
1990 16 1 8 92919 4 4 20 19904 5 54 8.60965
1975 17 1 0 30437 5 4 5 19754 2 24 0
2005 16 1 5 109611 4 4 35 20054 8 84 4.561586
1984 12 1 13 141390 3 3 14 19843 3 33 9.194427
1980 11 1 13 144179 3 3 10 19803 3 33 9.01657
1985 7 1 13 197817 2 2 15 19852 4 42 6.571731
2006 3 1 4 158679 1 1 36 20061 8 81 2.5208125
2005 13 1 19 152460 4 4 35 20054 8 84 12.462285
1980 6 1 8 196535 2 2 10 19802 3 32 4.070522
1976 11 1 12 145267 3 3 6 19763 2 23 8.260651
1998 5 1 7 159511 2 2 28 19982 6 62 4.388412
2002 17 1 1 74808 5 4 32 20024 7 74 1.3367554
1985 15 2 6 65008 4 4 15 19854 4 44 9.229633
2009 2 2 2 147367 1 1 39 20091 8 81 1.357156
2010 16 2 10 75178 4 4 40 20104 9 94 13.301764
2002 12 2 20 181388 3 3 32 20023 7 73 11.026088
1989 4 2 1 153928 1 1 19 19891 4 41 .6496544
end
label values age_group AgeCat
label def AgeCat 1 "0-4", modify
label def AgeCat 2 "5-9", modify
label def AgeCat 3 "10-14", modify
label def AgeCat 4 "15-19", modify
label def AgeCat 5 "20-24", modify
label def AgeCat 6 "25-29", modify
label def AgeCat 7 "30-34", modify
label def AgeCat 8 "35-39", modify
label def AgeCat 9 "40-44", modify
label def AgeCat 10 "45-49", modify
label def AgeCat 11 "50-54", modify
label def AgeCat 12 "55-59", modify
label def AgeCat 13 "60-64", modify
label def AgeCat 14 "65-69", modify
label def AgeCat 15 "70-74", modify
label def AgeCat 16 "75-79", modify
label def AgeCat 17 "80-84", modify
label def AgeCat 18 "85+", modify
label values sex sex
label def sex 1 "Female", modify
label def sex 2 "male", modify
label values year years
label def years 0 "1970", modify
label def years 1 "1971", modify
label def years 2 "1972", modify
label def years 3 "1973", modify
label def years 4 "1974", modify
label def years 5 "1975", modify
label def years 6 "1976", modify
label def years 7 "1977", modify
label def years 8 "1978", modify
label def years 9 "1979", modify
label def years 10 "1980", modify
label def years 11 "1981", modify
label def years 12 "1982", modify
label def years 13 "1983", modify
label def years 14 "1984", modify
label def years 15 "1985", modify
label def years 16 "1986", modify
label def years 17 "1987", modify
label def years 18 "1988", modify
label def years 19 "1989", modify
label def years 20 "1990", modify
label def years 21 "1991", modify
label def years 22 "1992", modify
label def years 23 "1993", modify
label def years 24 "1994", modify
label def years 25 "1995", modify
label def years 26 "1996", modify
label def years 27 "1997", modify
label def years 28 "1998", modify
label def years 29 "1999", modify
label def years 30 "2000", modify
label def years 31 "2001", modify
label def years 32 "2002", modify
label def years 33 "2003", modify
label def years 34 "2004", modify
label def years 35 "2005", modify
label def years 36 "2006", modify
label def years 37 "2007", modify
label def years 38 "2008", modify
label def years 39 "2009", modify
label def years 40 "2010", modify
label def years 41 "2011", modify
label def years 42 "2012", modify
label def years 43 "2013", modify

// STRATUM INCIDENCE RATES ARE ALREADY PRESENT IN THIS DATA
// SO THEY DO NOT NEED TO BE RE-CALCULATED

// MERGE WITH EUROPEAN STANDARD POPULATION DATA
merge m:1 age_group sex using `euro_standard'

// IF THERE ARE STRATUM GAPS IN ANY YEAR,
// IMPUTE INCIDENCE RATE OF ZERO TO THOSE
replace inc_rate = 0 if _merge == 2

// CALCULATE AGE-SEX STANDARDIZED RATES BY YEAR,
// WEIGHTING EACH STRATUM BY THE STANDARD POPULATION
collapse (mean) inc_rate [fweight = ESP2013], by(year)
label var inc_rate "Age-sex adjusted incidence per 100,000 population"
format inc_rate %3.2f
list, noobs
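
For reference, here is a sketch of what the final -collapse- step is meant to compute, assuming ordinary direct standardization. The symbols are my own notation, not variable names in the data: r is the stratum rate (inc_rate) for age group a, sex s and year t, and w is the standard population count (ESP2013) for that age-sex stratum. The weights come from the standard population, not from the observed population. The numbers in the second line are made up purely to illustrate the arithmetic.

```latex
% Directly standardized rate for year t: a weighted mean of the
% stratum rates, with weights taken from the standard population.
\[
  R_t \;=\; \frac{\sum_{a,s} w_{a,s}\, r_{a,s,t}}{\sum_{a,s} w_{a,s}}
\]
% Hypothetical two-stratum illustration (numbers are not from the data):
% rates 2 and 10 per 100,000 with standard weights 5000 and 1000.
\[
  R \;=\; \frac{5000 \cdot 2 + 1000 \cdot 10}{5000 + 1000}
    \;=\; \frac{20000}{6000} \;\approx\; 3.33
\]
```

A weighted mean with frequency weights equal to w is exactly this ratio, which is why a single -collapse (mean)- with weights can do the job.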