Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • plotting upper bounds, lowers bounds of survival probability cox regression

    I would like to plot a decision analysis curve for the 5 year survival, BUT I Can't plot the upper bounds and lower bounds following a cox regression (see red, can anyone tell me what I'm doing wrong?)
    I think it's something mathematical....

    Appreciate if anyone can help....
    Not sure If I'm asking questions in the right way, posted something else for the first time and no answer.. Was told this was a good platform...


    Code:
    stset ttlastfollowbcr, f(bcr)
    
    stcox modelsurgyear modelpsa modelece modelsvi modellni modelpgrade6 modelpgrade7 modelpgrade8 modelpgrade9 ///
        modelexperience, basesurv(surv_function) ///saving out baseline survival 
        
    *get linear predictor for calculation of risk 
    predict xb, xb 
    sum xb, detail. 
    predict se, stdp
    
    *baseline survival at time 5 years = 60 months 
    sum surv_funct if _t <=60
    
    **We want survival closest to 5 years = 60 months , this is lowest survival rate for all survival times < 5 years
    local base =r(min)
    
    **Generate 5 year risk using baseline survival converts into a probability 
    gen risk5y= 1-`base'^exp(xbeta) // converts to a probability 
    g slb= 1-`base'^exp(xbeta)+ invnorm(0.975*se) // unknown function **problem here
    g sub= 1-`base'^exp(xbeta)- invnorm(0.975*se) //unknwon function **problem here 
    
    
    label var risk5y "pROBABILITY OF FAILURE AT 5 YEARS"
    
    *Plot 5 year predicted probability 
    twoway (line risk5y slb sub experience, sort clpat(solid solid solid) clwidth(medthick thin thin) clcolor(gs0 gs0 gs0)), // 
    ytitle(5 year probability of freedrom from recurrence (%), margin(medium)) ylabel(75(5)100, angle(horizontal) nogrid) xtitle(experience (number of cases), margin(medium)) xlabel(0(250)1750) ///legend(off) scheme(s2mono) graphregion(fcolor(white))
    Here's a sample of the data
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte bcr float(dod ttlastfollowbcr) byte(_st _d) double _t byte _t0 float(modelsurgyear modelpsa modelece modelsvi modellni modelpgrade5 modelpgrade6 modelpgrade7 modelpgrade8 modelpgrade9 modelexperience)
    0 0 11.728952 1 0 11.728952407836914 0 2003     7 0 0 0 0 1 0 0 0 1875
    0 0  52.92813 1 0 52.928131103515625 0 2000   6.5 0 0 0 0 1 0 0 0    8
    0 0 37.585217 1 0   37.5852165222168 0 2002   1.8 0 0 0 0 0 1 0 0  257
    1 0 2.2340863 1 1  2.234086275100708 0 1991  25.3 1 1 1 0 1 0 0 0   93
    0 0  4.796715 1 0  4.796714782714844 0 1997  4.96 0 0 0 0 0 1 0 0  586
    0 0   80.4271 1 0   80.4271011352539 0 1997   6.7 1 0 0 0 0 1 0 0 1101
    0 0  86.53799 1 0   86.5379867553711 0 1996   6.1 1 0 0 0 0 1 0 0  181
    0 0  64.03285 1 0  64.03285217285156 0 1999    14 0 0 0 0 1 0 0 0  340
    0 0  1.379877 1 0 1.3798768520355225 0 2001   6.8 0 0 0 0 1 0 0 0   82
    0 0  54.76797 1 0 54.767967224121094 0 1993 10.62 0 1 0 0 0 1 0 0   44
    0 0  19.61396 1 0 19.613962173461914 0 2003  4.24 0 0 0 0 1 0 0 0 1917
    0 0 114.85831 1 0 114.85831451416016 0 1993   7.3 0 0 0 1 0 0 0 0  237
    0 0  1.379877 1 0 1.3798768520355225 0 1998     5 0 0 0 0 1 0 0 0   65
    0 0  1.379877 1 0 1.3798768520355225 0 2000   6.8 0 0 0 0 1 0 0 0   55
    0 0  90.18481 1 0  90.18480682373047 0 1995     5 0 0 0 0 1 0 0 0  825
    0 0  14.29158 1 0 14.291581153869629 0 2002   5.2 0 0 0 0 0 1 0 0  667
    0 0 68.336754 1 0  68.33675384521484 0 1992   6.5 1 0 0 1 0 0 0 0  142
    0 0  91.20329 1 0  91.20328521728516 0 1995  26.6 1 0 0 0 0 1 0 0   36
    0 0  38.93224 1 0   38.9322395324707 0 2002  6.17 0 0 0 0 1 0 0 0  262
    0 0  35.97536 1 0  35.97536087036133 0 1998     4 0 0 0 0 1 0 0 0  694
    end

  • #2
    You have done well providing a data example and code, but you need to ensure that your problem is reproducible. Here are some issues that I see:

    sum surv_funct if _t <=60
    1. The data example does not allow one to execute the stcox command, so I do not know if a variable named "surv_function" is generated by this command. If not, the local "base" is empty from

    local base =r(min)
    Now, Stata encounters an expression of the form:

    gen risk5y= 1-^exp(xbeta)
    and inputting some numbers will replicate your error:

    Code:
    display 1-^exp(12)
    Res.:

    Code:
    . display 1-^exp(12)
    unknown function ^exp()
    r(133);

    2. From

    gen risk5y= 1-`base'^exp(xbeta)
    I do not see a variable named "xbeta" generated. If you are using the result from predict, you called the generated variable "xb".

    *get linear predictor for calculation of risk
    predict xb, xb
    If the above data and code are not exhaustive and you indeed have all the variables, the following should be useful in helping you understand the behavior of local macros: https://journals.sagepub.com/doi/10....36867X20931028. The macro will not exist if you are running the code in chunks and the particular selection does not include the definition of the macro.

    Comment


    • #3
      Hi Andrew Musau thanks for trying to answer my question

      I don't understand why the data doesn't allow you to execture the stcox, doesn't one just need to drop the variable generated by stxcox?

      I've dropped the variables just in case:

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input byte bcr float(dod ttlastfollowbcr modelsurgyear modelpsa modelece modelsvi modellni modelpgrade5 modelpgrade6 modelpgrade7 modelpgrade8 modelpgrade9 modelexperience)
      0 0 11.728952 2003     7 0 0 0 0 1 0 0 0 1875
      0 0  52.92813 2000   6.5 0 0 0 0 1 0 0 0    8
      0 0 37.585217 2002   1.8 0 0 0 0 0 1 0 0  257
      1 0 2.2340863 1991  25.3 1 1 1 0 1 0 0 0   93
      0 0  4.796715 1997  4.96 0 0 0 0 0 1 0 0  586
      0 0   80.4271 1997   6.7 1 0 0 0 0 1 0 0 1101
      0 0  86.53799 1996   6.1 1 0 0 0 0 1 0 0  181
      0 0  64.03285 1999    14 0 0 0 0 1 0 0 0  340
      0 0  1.379877 2001   6.8 0 0 0 0 1 0 0 0   82
      0 0  54.76797 1993 10.62 0 1 0 0 0 1 0 0   44
      0 0  19.61396 2003  4.24 0 0 0 0 1 0 0 0 1917
      0 0 114.85831 1993   7.3 0 0 0 1 0 0 0 0  237
      0 0  1.379877 1998     5 0 0 0 0 1 0 0 0   65
      0 0  1.379877 2000   6.8 0 0 0 0 1 0 0 0   55
      0 0  90.18481 1995     5 0 0 0 0 1 0 0 0  825
      0 0  14.29158 2002   5.2 0 0 0 0 0 1 0 0  667
      0 0 68.336754 1992   6.5 1 0 0 1 0 0 0 0  142
      0 0  91.20329 1995  26.6 1 0 0 0 0 1 0 0   36
      0 0  38.93224 2002  6.17 0 0 0 0 1 0 0 0  262
      0 0  35.97536 1998     4 0 0 0 0 1 0 0 0  694
      end
      Answers to your questions:

      Yes the surv-function is generated from stcox - as seen from the code below
      So the local base isn't empty
      Thanks for pointing out the xbeta error in the variable name - I have fixed it, but I still get the same problem.

      g slb= 1-`base'^exp(xb)+ invnorm(0.975*se) // unknown function **problem here
      unknown function ^exp()
      r(133);



      My question for yourself & if you don't for anyone in this forum who may know,
      Is the code to plot the Upper and lower confidence intervals correct following stcox? Or does it need to be followed by some other mathematical components?
      I have not found any solutions to this anywhere....

      Code:
      stset ttlastfollowbcr, f(bcr)
      
      stcox modelsurgyear modelpsa modelece modelsvi modellni modelpgrade6 modelpgrade7 modelpgrade8 modelpgrade9 ///
          modelexperience, basesurv(surv_function) ///saving out baseline survival 
          
      *get linear predictor for calculation of risk 
      predict xb, xb 
      sum xb, detail
      predict se, stdp
      
      *baseline survival at time 5 years = 60 months 
      sum surv_funct if _t <=60
      
      **We want survival closest to 5 years = 60 months , this is lowest survival rate for all survival times < 5 years
      local base =r(min)
      
      **Generate 5 year risk using baseline survival converts into a probability 
      gen risk5y= 1-`base'^exp(xb) // converts to a probability 
      g slb= 1-`base'^exp(xb)+ invnorm(0.975*se) // unknown function **problem here
      g sub= 1-`base'^exp(xb)- invnorm(0.975*se) //unknwon function **problem here 
      
      
      label var risk5y "pROBABILITY OF FAILURE AT 5 YEARS"
      
      *Plot 5 year predicted probability 
      twoway (line risk5y slb sub experience, sort clpat(solid solid solid) clwidth(medthick thin thin) clcolor(gs0 gs0 gs0)), // 
      ytitle(5 year probability of freedrom from recurrence (%), margin(medium)) ylabel(75(5)100, angle(horizontal) nogrid) xtitle(experience (number of cases), margin(medium)) xlabel(0(250)1750) legend(off) scheme(s2mono) graphregion(fcolor(white))

      Comment


      • #4
        Originally posted by Rose Matthews View Post
        I don't understand why the data doesn't allow you to execture the stcox, doesn't one just need to drop the variable generated by stxcox?
        Essentially, you need to present a data example that needs no modifications when combined with your code. That's the definition of a reproducible example.


        g slb= 1-`base'^exp(xb)+ invnorm(0.975*se) // unknown function **problem here
        unknown function ^exp()
        r(133);
        I addressed this in #2. As long as the local is not empty, you will not get this error. The implication is that you need to run all the code in one go.


        Is the code to plot the Upper and lower confidence intervals correct following stcox? Or does it need to be followed by some other mathematical components?
        I have not found any solutions to this anywhere
        See https://journals.sagepub.com/doi/pdf...867X1101100104 for different ways to calculate the confidence intervals for the survivor function after stcox, if this is what you are after. The command from the Stata Journal will automatically do this for you without needing to manually program anything.

        Code:
        search survci

        Comment


        • #5
          Andrew Musau I've tried your advice with survci...
          But I get the following graph which doesn't make sense


          Click image for larger version

Name:	Screenshot 2023-09-06 at 13.40.05.png
Views:	1
Size:	99.0 KB
ID:	1726184

          Is there anyone who can help ? I am trying to plot a diagnostic analytical curve as seen here -
          Sourced from here: https://mskcc-epi-bio.github.io/deci...ic_Data_Set-up

          Firstly when I use survci - I Get a dead flat line

          Second, when I run the local macro, it creates 20 missing values so the graph plotted is empty...

          Can anyone help me find the error?

          Code:
          stset ttlastfollowbcr, f(bcr)
          
          stcox modelsurgyear modelpsa modelece modelsvi modellni modelpgrade6 modelpgrade7 modelpgrade8 modelpgrade9 ///
              modelexperience, basesurv(surv_function) ///saving out baseline survival 
              
          *get linear predictor for calculation of risk 
          predict xb, xb 
          sum xb, detail
          predict se, stdp
          
          *baseline survival at time 5 years = 60 months 
          sum surv_funct if _t <=60
          
          **We want survival closest to 5 years = 60 months , this is lowest survival rate for all survival times < 5 years
          local base =r(min)
          
          **Generate 5 year risk using baseline survival converts into a probability 
          gen risk5y= 1-`base'^exp(xb) // converts to a probability 
          label var risk5y "pROBABILITY OF FAILURE AT 5 YEARS"
          
          survci      ///Graph isn't good (see attached) ????
          
          g slb= 1-`base'^exp(xb)+ invnorm(0.975*se) //  20 missing values
          g sub= 1-`base'^exp(xb)- invnorm(0.975*se) //20 missing values
          
          
          label var risk5y "pROBABILITY OF FAILURE AT 5 YEARS"
          
          *Plot 5 year predicted probability 
          twoway (line risk5y slb sub experience, sort clpat(solid solid solid) clwidth(medthick thin thin) clcolor(gs0 gs0 gs0)), // 
          ytitle(5 year probability of freedrom from recurrence (%), margin(medium)) ylabel(75(5)100, angle(horizontal) nogrid) xtitle(experience (number of cases), margin(medium)) xlabel(0(250)1750) legend(off) scheme(s2mono) graphregion(fcolor(white))
          Attached Files

          Comment


          • #6
            The link that you show (which you should have shown in #1) has Stata code. The author has a command stdca that plots the decision curve. You can install this from

            Code:
            net install dca, from(https://raw.github.com/ddsjoberg/stata.dca/master)
            Now, before you try to modify the author's code so as to include confidence intervals, make sure that you are able to generate the decision curve with the author's command in the first place. I am assuming that you have more than the 20 observations that you present in #1. If not, then the problem is with your data. I will now bail out of this thread because my intention is to provide general advice or to address some small issue that I see, but not to do someone's project. If you are finding it difficult to establish if there are any problems with your data or to follow the author's code from the link, either talk to your supervisor or a colleague who has experience with Stata and is familiar with your data and decision curve analysis. Someone else may respond here, but I doubt that they will be willing to go deep into understanding your specific research questions and data. Also, you may want to email the author and ask if he has ideas on including confidence intervals to the decision curve generated by his command once you resolve any problems with your data.
            Last edited by Andrew Musau; 07 Sep 2023, 02:57.

            Comment

            Working...
            X