  • Time variable

    Hi! I am new to using Stata and am having a little trouble calculating a time-to-death variable.

    Variable 1: death date
    Type: double
    Format %9.0%

    Variable 2: event date
    Type: long
    Format %tdD_m_Y

    I attempted to do:
    gen days=(death-event)

    I end up with 02e+07 for every individual.

    How can I get a correct variable that represents the # of days between event and death?

    Thank you!

  • #2
    "%9.0%" is not a valid format.

    You are probably subtracting two variables that do not have the same unit. The event date seems to be a valid Stata date, that is, the number of days since 1960-01-01. You must check the other variable. If it holds, say, seconds, the subtraction is not going to work.

    The death date must be a number of days too. If it's not, Stata has functions to convert dates and timestamps and it should be possible to find a solution.
    However, I don't know what to do with your %9.0%. The fact that this variable has type "double" suggests that it might be a tc datetime, in which case the solution could be as simple as gen days=dofc(death)-event. But you really have to check the actual format of the death date first.
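
    A minimal sketch of that check, using the variable names from the question (that death might be a tc datetime is only a guess until the output confirms it):

    Code:
    * show the storage type and display format of both variables
    describe death event
    * look at a few values as they are currently displayed
    list death event in 1/5

    For reference, a valid td date in 2018 is a number a little over 21,000, whereas a tc datetime counts milliseconds since 1960-01-01 and is far larger.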
    Last edited by Jean-Claude Arbaut; 12 Nov 2018, 15:29.


    • #3
      Thank you...the problem may have been how I tried to change the original death date to match another dataset before appending the 2 datasets:

      Dataset 1:
      variable: dod
      Type: str10
      Format: %10s

      gen month = substr(dod,1,2)
      gen day = substr(dod,4,2)
      gen year = substr(dod,7,4)
      gen death = year+month+day

      now Type: str8
      now Format: %9s

      destring death, replace
      new type: long
      new format: %10.0g

      Dataset 2
      variable death
      Type: long
      Format: %12.0g

      Once I append, I get the message "variable death was float, now double to accommodate using data's values".


      • #4
        There are possibly several problems here.

        In order to compute with dates, you must use a numeric variable, not a string variable. But you can't simply convert a "YMD" string to a number: 20181112 is not a valid "td" date. Here are several ways to create a valid date:

        Code:
        display %td mdy(11,13,2018)
        display %td date("2018-11-13","YMD")
        display %td date("20181113","YMD")

        So, for the first dataset a correct way would probably be to apply the date function directly to the dod variable:

        Code:
        gen death=date(dod,"MDY")
        format %td death

        Now you have to check the second dataset: I have no idea how the numeric variable death is built in this dataset. If it's numerically "yyyymmdd" (for instance 20181113), you can extract the three parts with floor and mod, and use the mdy function. But I don't know for sure, and you have to tell us more.
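
        If death in the second dataset does turn out to be numeric yyyymmdd (an assumption to confirm against the data), the floor/mod approach could look like the sketch below. The name death_td is made up for illustration; event is the td variable from the first post.

        Code:
        * assumes death holds numbers such as 20181113
        * month = mod(floor(death/100),100), day = mod(death,100), year = floor(death/10000)
        gen long death_td = mdy(mod(floor(death/100),100), mod(death,100), floor(death/10000))
        format %td death_td
        * mdy() returns missing for impossible dates, so count the failures
        count if missing(death_td) & !missing(death)
        * both variables are now in days, so the difference is a number of days
        gen days = death_td - event

        A nonzero count would mean that some values are not valid yyyymmdd dates and need a closer look before computing days.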

        Note: using numeric yyyymmdd format is a very bad idea, because the numerical difference between two such dates is usually not the number of days between the dates. You have to use a "td" date for this to work. A td date is a number, which is the number of days since 1960-01-01. Thus the difference is really the number of days in between.
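
        A small illustration of the difference, with two consecutive days chosen arbitrarily as an example:

        Code:
        * 30 Nov 2018 and 1 Dec 2018 as yyyymmdd numbers: the difference is 71
        display 20181201 - 20181130
        * the same two days as td dates: the difference is 1
        display mdy(12,1,2018) - mdy(11,30,2018)
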
        Last edited by Jean-Claude Arbaut; 12 Nov 2018, 16:24.
