  • Cannot get ipdfc to work

    I happened to come across a user-written program that might be the solution to what I am attempting to do (compare a published KM curve to one we are working on locally).

    I am using ipdfc which was written to reconstruct individual participant data from a published Kaplan-Meier curve (findit ipdfc).

    I have digitised the KM curve from the following paper and wanted to use the ipdfc command to reconstruct the individual participant data.

    https://www.ncbi.nlm.nih.gov/pmc/art...v038p00769.pdf

    Based on this, I have extracted the following data points.

    Code:
    femalesagexdermott femalesproportionydermott
    2.250259 1.0009734
    6.7189907 .99996194
    11.187561 1.000384
    15.656271 .99956804
    20.125786 .99158437
    24.48281 .98801388
    29.064506 .97837551
    33.345984 .97288928
    38.048789 .93842834
    42.476369 .92490446
    47.000088 .9079631
    51.273208 .87864681
    54.617883 .81446671
    58.031029 .73895572
    62.397216 .70367233
    65.917298 .62334907
    67.837164 .58112485
    68.801202 .52348231
    70.192767 .44853756
    72.75203 .3972092
    75.418717 .33673176
    77.021841 .27274933
    79.189122 .21373451
    81.609278 .16217862
    83.74858 .06082447
    You may note that the source publication does not include a risk table, so I am out of luck there. However, according to the help file for ipdfc, one approach is to enter 0 for the trisk option and the total number of participants (60) for the nrisk option.

    Armed with all this, I have put together the following syntax:

    Code:
    ipdfc, surv(femalesproportionydermott) tstart(femalesagexdermott) trisk(0) nrisk(60) generate(FEMALEDERMOTT1 FEMALEDERMOTT2) saving(TEST2) proportion isotonic
    This returns:

    Code:
    0 invalid name
    (error in option trisk())
    If I instead create a dummy variable for each:
    Code:
    gen DUMMYZERO=0
    gen POPULATION=60
    Re-running the command:

    Code:
    ipdfc, surv(femalesproportionydermott) tstart(femalesagexdermott) trisk(DUMMYZERO) nrisk(POPULATION) generate(FEMALEDERMOTT1 FEMALEDERMOTT2) saving(TEST2) proportion isotonic
    This returns:

    Code:
    '0' invalid observation number
    Notably, the command at least doubles the number of lines in the file at some stage in its execution before it returns the error.


    I am using Stata 15.1 IC. I have also attempted to specify previous versions 14 and 13 using the version command to get this working. A colleague has tried on Stata 16 also, but to no avail.

    I have also tried to contact the original authors of the program, but their Stata profiles did not come up when I looked them up by name.

    A Google search and a search of the Stata forum have provided little further insight.

    Does anyone out there have experience using this ado in situations where the life table is not available?

    Thanks in advance,

    Geoff

  • #2
    I haven't used - ipdfc - but, by looking at the help files, I see the option - probability - but I fail to see the - proportion - option.

    By the way, Stata is also telling you there is an "error in option trisk". The help files recommend using "varname" instead of zero for trisk, unless "the total number of patients in the sample is known".
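
    Putting those two points together, an untested sketch of the call might look like the one below. It assumes that - probability - is the option for survival values on a 0 to 1 scale, and that trisk() and nrisk() only need a value in the first observation. The names risktime and riskn are made up for illustration.

    ```stata
    * Untested sketch: risktime and riskn are hypothetical variable names.
    * trisk() and nrisk() expect variable names, so the values go in variables.
    * Assumption: only the first observation needs a value (time 0, n = 60).
    generate risktime = 0 in 1
    generate riskn = 60 in 1

    ipdfc, surv(femalesproportionydermott) tstart(femalesagexdermott) ///
        trisk(risktime) nrisk(riskn) ///
        generate(FEMALEDERMOTT1 FEMALEDERMOTT2) saving(TEST2) ///
        probability isotonic
    ```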

    Hopefully that helps.
    Best regards,

    Marcos



    • #3
      Hi Geoff,

      I'm also using this program and am stuck in the same position where no risk table is available. I'm wondering if you ever figured this out?

      Thanks,


      • #4
        Hi all,

        I am also using this program, and ran into the same issue.

        Following what Marcos said, I created two new variables, trisk_na and nrisk_na, setting trisk_na to zero and nrisk_na to the number of people in the study. Note that in the original paper, the authors state that we need to provide the number of patients in the sample. This solved the initial error, which was:

        Code:
         
         '0' invalid observation number
        However, this gave me a new error:

        Code:
        observation number out of range
            Observation number must be between 154 and 2,147,483,619.  (Observation
            numbers are typed without commas.)
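
        For what it's worth, this looks like the message - set obs - gives when asked for fewer observations than the dataset already holds. My guess (and it is only a guess, as I have not read through the ado-file) is that ipdfc ends up requesting a smaller observation count than the rows loaded. A minimal sketch that reproduces the same kind of message without ipdfc:

        ```stata
        * Sketch: reproduce the message with set obs alone (no ipdfc involved)
        clear
        set obs 154

        * Asking for fewer observations than currently exist triggers the error;
        * capture noisily shows the message without stopping the do-file
        capture noisily set obs 100
        display "return code: " _rc
        ```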
        I have reached out to the author of the program, but I would appreciate any suggestions. Reproducible code and data are available below.

        Best,
        Kyle


        Code:
        // Conversion of extracted KM curve data to time-to-event data
        // Kyle Monahan
        // Tufts Technology Services
        // 5/4/2020
        
        // Turn on debugging
        set trace on
        
        // Load the package if you haven't already
        net install st0498.pkg
        
        // Clear previous data
        clear
        
        // Note to future users - change directory or place SampleData_AllData.xlsx in the same directory
        // Download sample data here: https://tufts.box.com/shared/static/7znmk3azhh9a5v3q9787e7mtngtt154a.xlsx
        
        // Load sample Data
        import excel "SampleData_AllData.xlsx", sheet("Sample") firstrow
        
        
        // Run model
        ipdfc, surv(s0) tstart(ts0) trisk(trisk) nrisk(nrisk0) isotonic ///
         generate(t_ipd event_ipd) saving(output)
        
        // This gives an error: observation number out of range
            // Observation number must be between 154 and 2,147,483,619.  (Observation
            // numbers are typed without commas.)
            
        // Try to run the model without trisk. From Wei et al.:
        // "Set trisk() as 0 only if the total number of patients in the sample is known. trisk() is required."
        // "If no risk table is available, specify nrisk() as the number of patients in the sample, and specify trisk() as 0."
        
        // Run model without risk table
        ipdfc, surv(s0) tstart(ts0) trisk(trisk_na) nrisk(nrisk_na) isotonic ///
         generate(t_ipd2 event_ipd2) saving(output2)



        • #5
          Hi all,

          Has anyone been able to contact the authors or get the code to work?

          I am also working on extracting the number of events from a KM curve, but I am running into trouble when the number at risk is greater than 1000. In particular, when I run stset on the resulting data, the last observed exit is the second time point (out of 92 time points) taken from the digitized plot.

          Best,
          Abdullah



          • #6
            Not sure if this helps anyone, but I was able to replicate the scenario using the data provided with the ipdfc command and got it to work. Maybe the problem is that it was meant to be run separately for the control and treatment groups, with the results then merged into one file for analysis. You also can't specify trisk(0) in the code; there needs to be a variable in the data file that provides the 0.



            Code:
            import delimited using "head_and_neck_arm0.txt", clear
            
            *I manually went in and deleted all values for trisk and nrisk other than the initial 0 and 211 (where 211 is the sample size)
            
            ipdfc, surv(s) tstart(ts) trisk(trisk) nrisk(nrisk) isotonic generate(t_ipd event_ipd) saving(temp0a)
            
            import delimited using "head_and_neck_arm1.txt", clear
            *did the same thing here, erasing the values for trisk and nrisk aside from 0 and 213 (n = 213)
            
            ipdfc, surv(s) tstart(ts) trisk(trisk) nrisk(nrisk) isotonic generate(t_ipd event_ipd) saving(temp1a)
            
            
            use temp0a, clear
            gen byte arm = 0
            append using temp1a
            replace arm = 1 if missing(arm)
            
            label define ARM 0 "Radiotherapy" 1 "Radiotherapy plus cetuximab"
            label values arm ARM
            
            stset t_ipd, failure(event_ipd)
            
            sts graph, by(arm) title("") xlabel(0(10)70) ylabel(0(0.2)1) ///
                risktable(0(10)50, order(2 "Radiotherapy" 1 "Radiotherapy plus")) ///
                xtitle("Months") l2title("Locoregional control") ///
                scheme(sj) graphregion(fcolor(white)) ///
                plot1opts(lpattern(solid) lcolor(gs12)) ///
                plot2opts(lpattern(solid) lcolor(black)) ///
                text(-0.38 -9.4 "cetuximab") ///
                legend(off) ///
                text(0.52 53 "Radiotherapy plus cetuximab") text(0.20 60 "Radiotherapy")
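
            If you don't want to edit the files by hand, the blanking step could probably be scripted. A rough sketch for arm 0, assuming trisk and nrisk import as numeric variables and the first row is time 0:

            ```stata
            * Sketch of the manual edit described above, not run against the shipped files
            import delimited using "head_and_neck_arm0.txt", clear

            * Keep a risk-table entry in the first observation only
            replace trisk = . if _n > 1
            replace nrisk = . if _n > 1

            * First row: time 0 and the arm's sample size (211 for arm 0)
            replace trisk = 0 in 1
            replace nrisk = 211 in 1
            ```

            The same lines with 213 in place of 211 would cover arm 1.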
