
  • #16
    I'm quite sure you didn't intend to waste my time. And I apologize if I was too harsh in my reaction. The lesson to learn is that in programming, the details are extremely important. Something that one thinks is "sufficiently similar" to the real data may, as was the case here, be different in ways that break the code.

    Anyway, the following should work:

    Code:
    //    CLEAN THE EURO STANDARD POPULATION
    //    AND MODIFY ITS CONTENT TO SUPPORT
    //    MATCHING WITH THE OTHER DATA
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str8 agegroup str3 sex str7 ESP2013
    "agegroup" "sex" "ESP2013"
    "0-4"      "M"   "5000"   
    "0-4"      "F"   "5000"   
    "5-9"      "M"   "5500"   
    "5-9"      "F"   "5500"   
    "10-14"    "M"   "5500"   
    "10-14"    "F"   "5500"   
    "15-19"    "M"   "5500"   
    "15-19"    "F"   "5500"   
    "20-24"    "M"   "6000"   
    "20-24"    "F"   "6000"   
    "25-29"    "M"   "6000"   
    "25-29"    "F"   "6000"   
    "30-34"    "M"   "6500"   
    "30-34"    "F"   "6500"   
    "35-39"    "M"   "7000"   
    "35-39"    "F"   "7000"   
    "40-44"    "M"   "7000"   
    "40-44"    "F"   "7000"   
    "45-49"    "M"   "7000"   
    "45-49"    "F"   "7000"   
    "50-54"    "M"   "7000"   
    "50-54"    "F"   "7000"   
    "55-59"    "M"   "6500"   
    "55-59"    "F"   "6500"   
    "60-64"    "M"   "6000"   
    "60-64"    "F"   "6000"   
    "65-69"    "M"   "5500"   
    "65-69"    "F"   "5500"   
    "70-74"    "M"   "5000"   
    "70-74"    "F"   "5000"   
    "75-79"    "M"   "4000"   
    "75-79"    "F"   "4000"   
    "80-84"    "M"   "2500"   
    "80-84"    "F"   "2500"   
    "85-89"    "M"   "1500"   
    "85-89"    "F"   "1500"   
    "90+"      "M"   "1000"   
    "90+"      "F"   "1000"   
    end
    //    REMOVE FIRST OBSERVATION WHICH CONTAINS VARIABLE NAMES
    drop in 1
    
    //    CONVERT ESP2013 TO NUMERIC VARIABLE
    destring ESP2013, replace
    
    //    COMBINE 85-89 AND 90+ AGE GROUPS
    //    SO AS TO MATCH WITH GLIOMA DATA,
    replace agegroup = "85+" if inlist(agegroup, "85-89", "90+")
    collapse (sum) ESP2013, by(agegroup sex)
    //    AND NUMERICALLY ENCODE IT TO MATCH THE USAGE
    //    IN THE GLIOMA DATA, RENAMING IT TO age_group
    label def AgeCat 1 "0-4", modify
    label def AgeCat 2 "5-9", modify
    label def AgeCat 3 "10-14", modify
    label def AgeCat 4 "15-19", modify
    label def AgeCat 5 "20-24", modify
    label def AgeCat 6 "25-29", modify
    label def AgeCat 7 "30-34", modify
    label def AgeCat 8 "35-39", modify
    label def AgeCat 9 "40-44", modify
    label def AgeCat 10 "45-49", modify
    label def AgeCat 11 "50-54", modify
    label def AgeCat 12 "55-59", modify
    label def AgeCat 13 "60-64", modify
    label def AgeCat 14 "65-69", modify
    label def AgeCat 15 "70-74", modify
    label def AgeCat 16 "75-79", modify
    label def AgeCat 17 "80-84", modify
    label def AgeCat 18 "85+", modify
    encode agegroup, gen(age_group) label(AgeCat)
    drop agegroup
    
    //    NUMERICALLY ENCODE SEX TO MATCH USAGE
    //    IN GLIOMA DATA
    label def sex 1 "F", modify
    label def sex 2 "M", modify
    rename sex _sex
    encode _sex, gen(sex) label(sex)
    drop _sex
    
    quietly compress
    tempfile euro_standard
    save `euro_standard'
    
    
    
    // NOW BRING IN THE GLIOMA DATA
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int(dg_y age_group) float sex byte allglioma long pop float(Agecat Agecat2 year Cat year5 Cat2 inc_rate)
    1971  8 1  2 134865 2 2  1 19712 1 12 1.4829644
    1976  4 1  3 190273 1 1  6 19761 2 21  1.576682
    1987 13 1 14 137021 4 4 17 19874 4 44 10.217412
    1984 13 1 14 138845 4 4 14 19844 3 34 10.083186
    2005  5 1  2 163226 2 2 35 20052 8 82  1.225295
    1990 16 1  8  92919 4 4 20 19904 5 54   8.60965
    1975 17 1  0  30437 5 4  5 19754 2 24         0
    2005 16 1  5 109611 4 4 35 20054 8 84  4.561586
    1984 12 1 13 141390 3 3 14 19843 3 33  9.194427
    1980 11 1 13 144179 3 3 10 19803 3 33   9.01657
    1985  7 1 13 197817 2 2 15 19852 4 42  6.571731
    2006  3 1  4 158679 1 1 36 20061 8 81 2.5208125
    2005 13 1 19 152460 4 4 35 20054 8 84 12.462285
    1980  6 1  8 196535 2 2 10 19802 3 32  4.070522
    1976 11 1 12 145267 3 3  6 19763 2 23  8.260651
    1998  5 1  7 159511 2 2 28 19982 6 62  4.388412
    2002 17 1  1  74808 5 4 32 20024 7 74 1.3367554
    1985 15 2  6  65008 4 4 15 19854 4 44  9.229633
    2009  2 2  2 147367 1 1 39 20091 8 81  1.357156
    2010 16 2 10  75178 4 4 40 20104 9 94 13.301764
    2002 12 2 20 181388 3 3 32 20023 7 73 11.026088
    1989  4 2  1 153928 1 1 19 19891 4 41  .6496544
    end
    label values age_group AgeCat
    label def AgeCat 1 "0-4", modify
    label def AgeCat 2 "5-9", modify
    label def AgeCat 3 "10-14", modify
    label def AgeCat 4 "15-19", modify
    label def AgeCat 5 "20-24", modify
    label def AgeCat 6 "25-29", modify
    label def AgeCat 7 "30-34", modify
    label def AgeCat 8 "35-39", modify
    label def AgeCat 9 "40-44", modify
    label def AgeCat 10 "45-49", modify
    label def AgeCat 11 "50-54", modify
    label def AgeCat 12 "55-59", modify
    label def AgeCat 13 "60-64", modify
    label def AgeCat 14 "65-69", modify
    label def AgeCat 15 "70-74", modify
    label def AgeCat 16 "75-79", modify
    label def AgeCat 17 "80-84", modify
    label def AgeCat 18 "85+", modify
    label values sex sex
    label def sex 1 "Female", modify
    label def sex 2 "male", modify
    label values year years
    label def years 0 "1970", modify
    label def years 1 "1971", modify
    label def years 2 "1972", modify
    label def years 3 "1973", modify
    label def years 4 "1974", modify
    label def years 5 "1975", modify
    label def years 6 "1976", modify
    label def years 7 "1977", modify
    label def years 8 "1978", modify
    label def years 9 "1979", modify
    label def years 10 "1980", modify
    label def years 11 "1981", modify
    label def years 12 "1982", modify
    label def years 13 "1983", modify
    label def years 14 "1984", modify
    label def years 15 "1985", modify
    label def years 16 "1986", modify
    label def years 17 "1987", modify
    label def years 18 "1988", modify
    label def years 19 "1989", modify
    label def years 20 "1990", modify
    label def years 21 "1991", modify
    label def years 22 "1992", modify
    label def years 23 "1993", modify
    label def years 24 "1994", modify
    label def years 25 "1995", modify
    label def years 26 "1996", modify
    label def years 27 "1997", modify
    label def years 28 "1998", modify
    label def years 29 "1999", modify
    label def years 30 "2000", modify
    label def years 31 "2001", modify
    label def years 32 "2002", modify
    label def years 33 "2003", modify
    label def years 34 "2004", modify
    label def years 35 "2005", modify
    label def years 36 "2006", modify
    label def years 37 "2007", modify
    label def years 38 "2008", modify
    label def years 39 "2009", modify
    label def years 40 "2010", modify
    label def years 41 "2011", modify
    label def years 42 "2012", modify
    label def years 43 "2013", modify
    // STRATUM INCIDENCE RATES ARE ALREADY PRESENT IN THIS DATA
    //    SO THEY DO NOT NEED TO BE RE-CALCULATED
    
    //    MERGE WITH EUROPEAN STANDARD POPULATION DATA
    merge m:1 age_group sex using `euro_standard'
    //    IF THERE ARE STRATUM GAPS IN ANY YEAR,
    //    IMPUTE INCIDENCE RATE OF ZERO TO THOSE
    replace inc_rate = 0 if _merge == 2
    
    //    CALCULATE AGE-SEX STANDARDIZED RATES BY YEAR
    collapse (mean) inc_rate [fweight = pop], by(year)
    label var inc_rate "Age-sex adjusted incidence per 100,000 population"
    format inc_rate %3.2f
    
    list, noobs



    • #17
      Thank you so much, Clyde. Now I have age-sex standardized incidence rates. Shouldn't it be [fweight = ESP2013] instead of [fweight = pop] in the above command?
      collapse (mean) inc_rate [fweight = ESP2013], by(year)
      So, to calculate age-standardized incidence rates among men and women and for the 4 broad age categories, I used the command below. Is that right?
      collapse (mean) inc_rate [fweight = ESP2013], by(year sex age Agecat2)
      Thank you, Clyde, I really appreciate your help.



      • #18
        You are correct, it is [fweight = ESP2013]. Sorry for the error.
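
        To see why the standard population is the right weight, here is a minimal sketch on made-up numbers (the two strata and their rates are hypothetical, not taken from the data in this thread). It checks that the frequency-weighted mean with ESP2013 as the weight equals the directly standardized rate computed by hand as sum(rate * weight) / sum(weight).

        ```stata
        // Toy data (hypothetical values): two age strata in a single year
        clear
        input year age_group inc_rate ESP2013
        2000 1  2 5000
        2000 2 10 1000
        end

        // Directly standardized rate by hand: sum(rate * weight) / sum(weight)
        egen double num = total(inc_rate * ESP2013), by(year)
        egen double den = total(ESP2013), by(year)
        generate double asr_manual = num / den

        // The same quantity as a frequency-weighted mean of the stratum rates
        collapse (mean) inc_rate (first) asr_manual [fweight = ESP2013], by(year)

        // The two calculations should agree up to rounding
        assert reldif(inc_rate, asr_manual) < 1e-6
        list, noobs
        ```

        With [fweight = pop] the weights would instead be each year's own population, which gives the crude rate rather than a standardized one.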



        • #19
          I don't think
          Code:
          collapse (mean) inc_rate [fweight = ESP2013], by(year sex age Agecat2)
          is quite right. There is no variable called age in your data, and Stata will interpret it to mean age_group, the only variable whose name starts with age. So you will end up getting your data aggregated by the original age groups (which, it appears, are nested in Agecat2), not by Agecat2 itself. So I would take "age" out of the -by()- option.
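
          As a sketch of what that could look like, assuming the merged data from #16 are still in memory: the -set varabbrev off- line is an optional extra, not something used earlier in this thread. With abbreviation off, a name like age is rejected as an error instead of being silently matched to age_group.

          ```stata
          // Assumes the merged glioma + European standard population data are in memory

          // Optional safeguard: disallow abbreviated variable names,
          // so a stray "age" stops with an error instead of matching age_group
          set varabbrev off

          // Standardized rates by year and sex within the broad age categories
          collapse (mean) inc_rate [fweight = ESP2013], by(year sex Agecat2)
          list, noobs
          ```

          Note that -collapse- replaces the data in memory, so run it on a copy (or wrap it in -preserve- and -restore-) if you also need the by-year results.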



          • #20
            Yeah, I did not mean to include age; it was an error. And thank you once again.
