  • Estimate risk for groups

    Ciao. Here is an example using Stata data.

    Code:
    use http://www.stata-press.com/data/r13/drugtr
    stset studytime, failure(died)
    stcox drug age
    Now, if we want the hazard ratio (HR) for subjects younger than 56 and for those 56 or older, we make a new variable:
    Code:
    gen lessthan56 = age
    recode lessthan56 (min/55 = 0)
    recode lessthan56 (56/max = 1)
    But how can I get the hazard ratio for each of these two groups?

  • #2
    Your code for creating lessthan56 is far more complicated than it needs to be. See -help fvvarlist- to learn about factor variable notation. Then read https://www3.nd.edu/~rwilliam/stats/Margins01.pdf, from the excellent Richard Williams, for a crystal clear introduction to the amazing -margins- command.

    Code:
    gen byte lessthan56 = (age < 56) if !missing(age)
    stcox drug i.lessthan56
    margins lessthan56
    All of that said, unless there is a clear theoretical reason to believe that the hazard ratio changes abruptly at age 56, dichotomizing age in this way is a bad idea. It discards information (it treats a 55-year-old as radically different from a 56-year-old, yet treats that same 55-year-old as identical to a 5-year-old) and may introduce bias into the analysis. Categorizing inherently continuous variables can be helpful for displaying descriptive statistics, but such categorizations are usually unsuitable for analysis.
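
    To illustrate the alternative of keeping age continuous, here is a minimal sketch, assuming the same drugtr data as above. It relies on the default prediction that -margins- uses after -stcox-, the relative hazard, and evaluates it at a few ages. The ages 50, 55, 56 and 60 are arbitrary values chosen only for illustration.

    ```stata
    * Reload and declare the survival data (same setup as in #1)
    use http://www.stata-press.com/data/r13/drugtr, clear
    stset studytime, failure(died)

    * Keep age continuous: one hazard ratio per year of age
    stcox drug age

    * Predicted relative hazard at selected ages, averaged over the
    * sample distribution of drug (ages are illustrative assumptions)
    margins, at(age=(50 55 56 60))
    ```

    With age entered this way, the predictions for ages 55 and 56 differ only by the one-year hazard ratio, rather than jumping at an arbitrary cutoff.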



    • #3
      Reading Clyde's excellent (as usual) reply, the following reference comes to mind: https://www.ncbi.nlm.nih.gov/pubmed/16217841.
      Kind regards,
      Carlo
      (Stata 19.0)



      • #4
        Clyde Schechter, Carlo Lazzaro: Thanks a bunch. I am actually just using this to explore analysis in Stata, and thank you very much for your helpful comments. Is it true that the -margins- command outputs the hazard ratio for each of the two groups? I see that the margin values are close to the HR in the stcox model; however, the confidence intervals differ greatly. Is that to be expected? Lastly is it acceptable to run the stcox model for each group in separate models?
        Last edited by sladmin; 17 Oct 2019, 08:17. Reason: anonymize original poster



        • #5
          Your model contains variables other than age, and the output from -margins- gives you hazard ratios that are adjusted to the overall sample distribution of the other variables. That is why they differ somewhat from the direct output of -stcox-. The standard errors are also calculated in different ways.

          Lastly is it acceptable to run the stcox model for each group in separate models?
          Acceptable for what purpose?



          • #6
            Clyde Schechter, thanks a bunch. Is that an acceptable way to get the hazard ratio for both groups?



            • #7
              If you were to run -stcox drug i.age_group-, with just two age groups, you would get one hazard ratio, not two, for age, because one category becomes the reference, and the hazard ratio you get is the ratio of the hazard in one age group to the hazard in the other. So there really aren't going to be two hazard ratios to compare if you do this (except in the rather trivial sense that one of them is arbitrarily constrained to 1.)

              If you run -stcox drug i.age_group if age_group == 0- and again for 1, you will get no results at all for age_group, because in each of these models, age_group will be a constant, and so will be omitted from the model due to collinearity with the constant term (which, in a Cox model, is absorbed into the baseline hazard). So, again, nothing to compare.
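
              The two cases above can be sketched as follows. This is a minimal sketch, assuming the drugtr data from #1 and using the lessthan56 variable from #2 in place of the generic age_group.

              ```stata
              * Setup as in #1 and #2
              use http://www.stata-press.com/data/r13/drugtr, clear
              stset studytime, failure(died)
              gen byte lessthan56 = (age < 56) if !missing(age)

              * Full sample: one hazard ratio for lessthan56
              * (age under 56 relative to the reference group, 56 or older)
              stcox drug i.lessthan56

              * Separate models: lessthan56 is constant within each
              * subsample, so no hazard ratio can be estimated for it
              stcox drug i.lessthan56 if lessthan56 == 0
              stcox drug i.lessthan56 if lessthan56 == 1
              ```

              The subgroup models still report a hazard ratio for drug within each age group, but that answers a different question from the one about age.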

              If you run -stcox drug i.age_group- followed by -margins age_group- you will get two hazard ratios. What do they mean? Well, these hazard ratios will be adjusted for the distribution of drug, which may well differ among the age groups (especially if this is observational data.) That is, they represent the average hazard ratios, relative to a base state where all of the model variables are 0, among those in each of those two age groups, and with the distribution of drug (or any other model variables) being standardized to that of the overall distribution of drug in the entire sample. The ratio of those two hazard ratios will be similar to the single hazard ratio for age_group shown in the Cox regression output if the distribution of drug (and any other variables) is nearly the same in both age groups.
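
              As a rough check on this relationship, here is a minimal sketch, again assuming the drugtr data and the lessthan56 variable from #2. It posts the two margins and takes their ratio with -nlcom-. In a model with no interaction terms, -margins- averages over the same full sample for both levels, so the ratio should reproduce the lessthan56 hazard ratio reported by -stcox-. The coefficient names used in -nlcom- are an assumption about how the posted margins are labeled; -margins lessthan56, coeflegend- shows the names actually in use.

              ```stata
              * Setup as in #1 and #2
              use http://www.stata-press.com/data/r13/drugtr, clear
              stset studytime, failure(died)
              gen byte lessthan56 = (age < 56) if !missing(age)

              * One hazard ratio for lessthan56 relative to the reference
              stcox drug i.lessthan56

              * Two predictive margins on the relative-hazard scale,
              * posted so they can be combined afterwards
              margins lessthan56, post

              * Ratio of the two margins, with a delta-method interval
              nlcom (ratio: _b[1.lessthan56] / _b[0.lessthan56])
              ```

              The interval from -nlcom- is symmetric on the ratio scale, whereas -stcox- builds its interval on the log scale and exponentiates, which is one reason the confidence intervals need not agree even when the point estimates do.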
