  • Standardizing Rates to Population of Choice

    Dear All,

    I have a dataset containing data from people in many regions and I want to calculate age-standardized rates using a standard population as reference.

    I need help with two things. The first is deriving rates when each line of data refers to one person rather than an age group. The second is supplying the standard population of choice.

    The data looks like this:

    ID   age-group_per   region   age-group_pop   dead

    6    5-9             a        2000            0
    7    10-14           b        3000            1
    9    20-24           a        1000            0
    10   30-34           b        6000            1
    11   10-14           c        6000            0
    12   10-14           b        3000            0
    13   5-9             a        2000            0
    14   20-24           c        1000            1
    15   20-24           b        2000            1

    age-group_per is the age group each individual falls into
    region is the region where each individual lives
    age-group_pop is the population of that age group in that region
    dead indicates whether the individual died (1) or not (0)

    To use the standardization commands, I would expect the total number of deaths in each age group and region to be already calculated as a separate variable (died), in which case it would be straightforward to get a rate, e.g.:

    age-group   region   age-group_pop   died
    5-9         a        2000            3
    20-24       a        1000            2
    10-14       b        3000            2
    20-24       b        2000            3
    10-14       c        1000            2
    20-24       c        1000            2


    but it is not so in this case.

    How do you suggest the data be formatted so that calculating a rate (died/age-group_pop) is possible and standardization is easy?

    A related question: how do I incorporate the "standard population" if it is not one of the populations being examined?

    I would appreciate comments.


    Thanks a lot.


    Bode.

  • #2
    You need two commands, collapse and dstdize. With collapse you can create the collapsed table you want. Note, however:
    1. age-group is not a valid variable name; agegroup is.
    2. If agegroup is a string variable, you will probably need to replace it by a numeric variable. At least, agegroup must have the same name and the same coding in your dataset and in the standard population file.
    3. You don't state the time at risk for the events. You need that to calculate rates.

    See help dstdize on how to incorporate the standard population.
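
    The two steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the variables have been renamed to valid names (agegroup, region, agegroup_pop, dead), agegroup_pop is constant within each region and age group and is an acceptable denominator, and standardpop.dta is a hypothetical file holding the standard population with the same agegroup coding and a variable also named agegroup_pop.

    ```stata
    * Sketch only: variable names and standardpop.dta are assumptions
    * One row per person -> one row per region and age group
    collapse (sum) died = dead (first) agegroup_pop, by(region agegroup)

    * Crude age-specific rate in each stratum
    generate double rate = died / agegroup_pop

    * Direct standardization against an external standard population;
    * the using() file must contain agegroup_pop and agegroup
    dstdize died agegroup_pop agegroup, by(region) using(standardpop)
    ```

    If agegroup_pop is not a suitable stand-in for time at risk, the denominator would need to be replaced before the last two commands.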

    • #3
      Thank you Svend. This was very helpful.

      • #4
        The previous suggestions were excellent ideas. I have been searching for a solution to a more recent development.

        I have several populations to standardize in the same dataset, and I was wondering if there is a way to incorporate the standardized rates for each population into the dataset. The -saving- option, which I have attempted to use, does not seem to provide that utility (or perhaps I am not using it right).
        e.g. dstdize died agegroup_pop agegroup, by(region) saving("E:\Analyses\Population Studies\standardized.dta")

        I want the new dataset to have an additional variable: "Adj_Rate"

        Could you please suggest another command that performs this function?

        Thanks for all your help.

        • #5
          I don't understand the purpose of the saving() option of dstdize which requires the standard population distribution as input. I experimented, and the saving() option just produced a copy of the population distribution I had provided.

          You have "several populations" to standardize in the same dataset. When you want to incorporate the "standardized rates" for each population, I assume that you mean standard population distribution; dstdize does not use standardized rates; it produces them. The by() option of dstdize is probably what you need.

          dstdize produces a single adjusted rate for each population specified by the by() option; I don't know how you would want to include it in the dataset.

          • #6
            Thank you Svend. The point you raised about the dstdize command producing rates is the exact reason I am trying out options. Because there are so many populations and the standardized rates will be used in further analyses, it is cumbersome to enter the rates from the output into the dataset by hand.

            Going through the help files, I realized one method that could work: manually "standardizing" the rates by merging the standard population with the dataset and then multiplying the rates by the standard population's proportions.

            gen crude = died/agegroup_pop
            gen product = crude*standard_agegroup_proportion
            bysort region: egen adjusted = total(product)

            It seems like the best option now unless there are other thoughts.
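
            The merge step that supplies standard_agegroup_proportion is not shown above; one way it might look is sketched here. The file names standardpop.dta and collapsed.dta and the variable std_pop are hypothetical, and the sketch assumes agegroup is coded identically in both files.

            ```stata
            * Sketch only: file and variable names are assumptions
            * Turn the standard population counts into proportions
            use standardpop, clear
            egen double std_total = total(std_pop)
            generate double standard_agegroup_proportion = std_pop / std_total
            keep agegroup standard_agegroup_proportion
            tempfile std
            save `std'

            * Attach the proportions to the data (one row per region and age group)
            use collapsed, clear
            merge m:1 agegroup using `std', keep(master match) nogenerate
            ```

            Computing the proportions before merging keeps the standard weights summing to 1 over the whole standard population, even when a region has no rows for some age group.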

            Thanks.

            • #7
              Bode: Take a look at the "Stored results" section of the dstdize entry in the manual; many results are stored and hence available for subsequent analysis.
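
              As a rough sketch of what that could look like, the adjusted rates can be copied from r() into a matrix right after the command runs. The file standardpop.dta is hypothetical, and the assumption that the columns of r(adj) follow the by() groups in order should be checked against the manual before relying on it.

              ```stata
              * Sketch only: standardpop.dta is a hypothetical standard population file
              dstdize died agegroup_pop agegroup, by(region) using(standardpop)

              * Copy the adjusted rates before another command overwrites r()
              matrix adj = r(adj)

              * Inspect what was stored
              return list
              matrix list adj
              ```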

              • #8
                Thank you Svend.
