  • Replacing if one date is greater than the other + satisfies a condition

    Hi, I've got a list of dates and a condition I'm interested in: DVT.
    var2 is the date when the patient was admitted to hospital.
    opdateproper is the date when the patient had the operation.

    Thus one can conclude that if the patient had DVT after the operation, then opdateproper would be > var2.

    The dataset and the code I used are below.


    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str4 var1 float var2 byte DVT float(opdateproper postopcomplication)
    "date" 1202021 1 1202021 0
    "" 13032022 1 13032022 0
    "" 14022022 0 . 0
    end

    gen postopcomplication = 0
    replace postopcomplication = 1 if DVT == 1 & opdateproper > var2


    Questions:
    1. Why hasn't Stata replaced DVT = 1 for the opdateproper which is marked as ., as this is greater than var2?
    2. Is there a more efficient way of doing this in terms of syntax?



  • #2
    First, your code doesn't even have anything to do with replacing values of DVT; it only creates and modifies a variable postopcomplication. Assuming you meant, "why hasn't Stata replaced postopcomplication = 1...", it is simply because in the observation you single out, DVT = 0, so the combined criterion -DVT == 1 & opdateproper > var2- is not met. Did you perhaps mean to code it as -DVT == 1 | opdateproper > var2-?
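
    As a rough illustration of that point, here is a minimal sketch that evaluates each half of the criterion separately on the three observations from the original post. The variable names is_dvt, op_later, both and both_nomiss are made up for this example. Note that Stata treats a missing value (.) as larger than any nonmissing number, so -opdateproper > var2- is true in the third observation; it is the DVT == 1 half that fails there. In the first two observations the two numbers are equal, so the > comparison is false.

    ```stata
    * Sketch only: the three observations from the original post
    clear
    input float var2 byte DVT float opdateproper
    1202021  1 1202021
    13032022 1 13032022
    14022022 0 .
    end

    * Evaluate each half of the criterion on its own (hypothetical helper variables)
    gen byte is_dvt   = DVT == 1
    gen byte op_later = opdateproper > var2      // missing (.) counts as larger than any number
    gen byte both     = is_dvt & op_later

    * If a missing operation date should never count as "later", exclude it explicitly
    gen byte both_nomiss = is_dvt & opdateproper > var2 & !missing(opdateproper)

    list, noobs
    ```

    This still compares the raw numbers rather than real dates, so it only shows how the logical condition is evaluated.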

    But, before you proceed, your data are not ready for this kind of analysis (or any analysis, really). In particular, neither var2 nor opdateproper is a Stata date variable. They are simply numbers that sort of look like dates to human eyes. They need to be converted to Stata date variables using something like:

    Code:
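    foreach v of varlist var2 opdateproper {
        * Convert the number to an 8-character string, padding with a leading zero
        tostring `v', replace format(%08.0f)
        * Parse the string as day-month-year into a Stata daily date
        gen _`v' = daily(`v', "DMY"), after(`v')
        format _`v' %td
        * Replace the original variable with the converted date
        drop `v'
        rename _`v' `v'
    }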
    However, this will not completely solve your problem because, for example, the number 1202021 is not, even to human eyes, a valid date in DMY format: the month would have to be either 0 or 20, neither of which is valid. I'm assuming that your dates are DMY because the other values, 13032022 and 14022022, are only interpretable as dates in that way. So after running the above code, you will be left with some missing values in the date variables that did not arise from missing values in the original. You will have to go back to the creation of this data set to figure out how to fix those.
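
    One way to find those values is to keep the original number alongside the converted date and list the observations where a nonmissing number produced a missing date. This is only a sketch: the names var2_str and var2_date are invented for the example, and it uses just the three var2 values from the original post.

    ```stata
    * Sketch only: the var2 values from the original post
    clear
    input float var2
    1202021
    13032022
    14022022
    end

    * Keep the original number and build the date in new (hypothetical) variables
    gen str8 var2_str = string(var2, "%08.0f")
    gen var2_date = daily(var2_str, "DMY")
    format var2_date %td

    * Nonmissing numbers that failed to convert to a date
    list var2 var2_str var2_date if missing(var2_date) & !missing(var2)
    ```

    With these three values, only 1202021 should be listed, since it becomes "01202021" and month 20 is not valid.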

    But be aware that until you fix this problem you will get spurious results. For example, 15032022 > 14062022, but interpreted as dates, the ordering is the opposite: 15mar2022 < 14jun2022. So you definitely need to work on this to get anything other than nonsense.
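
    The ordering problem can be checked directly with -display-. This short sketch uses only the two example values from the paragraph above:

    ```stata
    * As plain numbers, the DMY-style value for 15 March is larger than the one for 14 June
    display 15032022 > 14062022

    * As Stata daily dates, 15mar2022 comes before 14jun2022
    display daily("15032022", "DMY") > daily("14062022", "DMY")
    ```

    The first comparison is true (Stata shows 1) and the second is false (0), which is why comparisons on the unconverted numbers cannot be trusted.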
