  • Generating age-sex standardized variables (-dstdize-?)

    Dear Statalisters,

    I apologize in advance because this might be a bit of a simple question, but I just can't seem to figure it out, so I hope someone is able to help. I am planning to do an ordered probit regression with a number of (more or less objective) health indicators as independent variables and subjective health status as dependent variable. However, since my dataset contains men and women between 50 and 65 years of age from 4 different countries, I want to standardize these objective health indicators by age and sex, yet I am not sure how to do this exactly.

    I thought about using -egen newvar = std(oldvar)-, but this function cannot be combined with 'by' (nor with the bysort prefix), which means that it does not standardize oldvar specifically for age or sex. The alternative would be -dstdize- or -istdize-, I think, but I am not sure whether either would be the appropriate command in my case. If I understand correctly, -dstdize- is typically used to calculate standardized mortality rates, but it does not generate a new variable. However, what I am trying to do is convert the current health variables into new variables that are standardized for age and sex, use these in an oprobit regression with subjective health as Y (incl. country dummies), and finally, generate an adjusted health index (via oprobit post-estimation).

    Does anyone have an idea whether -dstdize- would be a suitable command, and if it can be used to generate new (standardized) variables? If not: any ideas on a different approach?

    Thanks in advance.

  • #2
    You are right: dstdize calculates standardized rates; it does not generate any variables. Unfortunately I have no idea about what else to do.

    • #3
      Thanks for your reply though! Hopefully someone else will have a tip.

      • #4
        I do not completely understand what you are doing (-dstdize- and -egen newvar=std(oldvar)- seem different to me). That said, although you can't use "by" with the -egen- std function, you can use "if". So why not make several new variables, one per subgroup, and then combine them into the single variable that you want? Assuming only one non-missing newvar per row, you could use -egen- with one of its "row" functions.

        • #5
          Have you thought of using egen newvar = std(oldvar), but by parts?
          Perhaps something like
          Code:
          * Assumes sex is coded 1/2 and age is in whole years (50 to 65, as in the original post)
          gen new = .
          forvalues a = 50/65 {
              forvalues s = 1/2 {
                  capture drop x
                  * z-score of oldvar within this age-sex cell only
                  egen x = std(oldvar) if age == `a' & sex == `s'
                  replace new = x if age == `a' & sex == `s'
              }
          }
          capture drop x
          Might not be the most elegant solution, but might work.
          HTH

          • #6
            There is confusion in this post between the statistical and epidemiological uses of the term "standardize." The -egen, std()- command does statistical standardization: it calculates the value of a variable centered to 0 and scaled to a standard deviation of 1. The command -dstdize- does epidemiological standardization: it calculates counterfactual rates of occurrence of events in a hypothetical or real standardized population based on stratum-specific observed rates.
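
            For the statistical sense of the term, here is a minimal sketch of standardizing within age-sex cells without -egen, std()-. It relies on the -egen- functions mean() and sd(), which do allow -by-. The variable names oldvar, age, and sex are assumptions borrowed from the code in #5, so adjust them to your data.
            Code:
            * Hypothetical variable names: oldvar, age, sex
            * Cell-specific mean and SD for every age-sex combination
            bysort age sex: egen double cell_mean = mean(oldvar)
            by age sex: egen double cell_sd = sd(oldvar)
            * z-score relative to the observation's own age-sex cell
            gen double z_oldvar = (oldvar - cell_mean) / cell_sd
            drop cell_mean cell_sd
            Cells with a single observation have a missing SD, so z_oldvar will be missing there.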

            From the overall thrust of the original post, I think that Daniela wants to do epidemiological standardization. -dstdize- will get her what she needs, but it does not save its results in new variables, as she notes (nor does it leave that information behind in r()). When all is said and done, standardized rates are just stratum-size-weighted averages of the observed rates. So it's actually fairly easy to do these calculations with some simple commands. But the details of the actual data structure matter. So I would suggest that Daniela post some representative data that she has (paste the output of a suitable -list- command into a code block) and explain what the variables in question are, and probably we can make progress from there. Also, be sure to specify what your standard age-sex distribution is and where you have it as a data file.
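
            To make the "stratum-size-weighted average" idea concrete, here is a rough sketch of direct standardization done by hand. Everything about the data layout is an assumption on my part: one row per person, a 0/1 variable called event, and variables country, age, and sex. I have also assumed the standard population is simply the pooled sample, which may well not be the standard Daniela has in mind.
            Code:
            * Hypothetical variable names: event, country, age, sex
            * Observed rate in each country-age-sex stratum
            bysort country age sex: egen double rate = mean(event)
            * Standard weights: each age-sex stratum's share of the pooled sample (an assumption)
            bysort age sex: gen double w = _N
            quietly count
            replace w = w / r(N)
            * Flag one row per country-age-sex stratum so each stratum is counted once
            egen byte tag = tag(country age sex)
            * Weighted sum of stratum rates, rescaled in case a country lacks some strata
            bysort country: egen double std_rate = total(cond(tag, rate * w, 0))
            by country: egen double wsum = total(cond(tag, w, 0))
            replace std_rate = std_rate / wsum
            drop tag wsum
            This leaves std_rate as a new variable that is constant within country. With a different standard distribution stored in a separate file, the weights w would come from a -merge- on age and sex instead of from _N.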
