Replacing if one date is greater than the other + satisfies a condition

Martin Imelda Borg

Join Date: Jan 2022

Posts: 225
#1


26 Sep 2022, 10:32

Hi, I've got a list of dates and a condition I'm interested in: DVT.
var2 is the date the patient was admitted to hospital.
opdateproper is the date the patient had the operation.

Thus one can conclude that if the patient developed DVT after the operation, then opdateproper would be > var2.

The dataset and the code I used are below.

[CODE]
* Example generated by -dataex-. For more info, type help dataex
clear
input str4 var1 float var2 byte DVT float(opdateproper postopcomplication)
"date" 1202021 1 1202021 0
"" 13032022 1 13032022 0
"" 14022022 0 . 0
end
[/CODE]
------------------ copy up to and including the previous line ------------------

gen postopcomplication = 0
replace postopcomplication = 1 if DVT == 1 & opdateproper > var2

Questions:
1. Why hasn't Stata replaced DVT = 1 for the observation where opdateproper is missing (.), given that missing is greater than var2?
2. Is there a more efficient way of doing this in terms of syntax?
Clyde Schechter

Join Date: Apr 2014

Posts: 30078
#2

26 Sep 2022, 10:54

First, your code doesn't have anything to do with replacing values of DVT; it only creates and modifies a variable, postopcomplication. Assuming you meant "why hasn't Stata replaced postopcomplication = 1...", it is simply because, in the observation you single out, DVT = 0, so the combined criterion -DVT == 1 & opdateproper > var2- is not met. Did you perhaps mean to code it as -DVT == 1 | opdateproper > var2-?
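
To make the difference between the two criteria concrete, here is a minimal sketch that re-enters only the three observations from #1. The variable names post_and, post_or and post_guard are made up for the illustration, and the single-line -gen- with a logical expression is just one possible answer to the question about shorter syntax:

Code:

* Sketch only: the three example observations from #1
clear
input float var2 byte DVT float opdateproper
 1202021 1  1202021
13032022 1 13032022
14022022 0        .
end

* & needs both parts to be true; observation 3 has DVT == 0, so it stays 0
gen byte post_and = DVT == 1 & opdateproper > var2

* | needs only one part to be true; numeric missing (.) counts as larger
* than any number, so opdateproper > var2 is true in observation 3
gen byte post_or = DVT == 1 | opdateproper > var2

* explicit guard, in case a missing operation date should never qualify
gen byte post_guard = DVT == 1 & opdateproper > var2 & !missing(opdateproper)

list, noobs

Note that in the first two observations opdateproper equals var2, so the strict inequality is false there as well, and post_and is 0 throughout.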

But, before you proceed, your data are not ready for this kind of analysis (or any analysis, really). In particular, neither var2 nor opdateproper is a Stata date variable. They are simply numbers that sort of look like dates to human eyes. They need to be converted to Stata date variables using something like:

Code:

foreach v of varlist var2 opdateproper {
    tostring `v', replace format(%08.0f)
    gen _`v' = daily(`v', "DMY"), after(`v')
    format _`v' %td
    drop `v'
    rename _`v' `v'
}

However, this will not completely solve your problem because, for example, the number 1202021 is not, even to human eyes, a valid date in DMY format: the month would have to be either 0 or 20, neither of which is valid. I'm assuming that your dates are DMY because the other values, 13032022 and 14022022, are only interpretable as dates in that way. So after running the above code, you will be left with some missing values in the date variables that did not arise from missing values in the original. You will have to go back to the creation of this data set to figure out how to fix those.
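
One way to find those observations is sketched below. It assumes the example data from #1 are in memory and that var2 is still the original numeric variable (that is, the conversion loop has not yet been run on it); var2_str and var2_date are hypothetical names used only for this check:

Code:

* Sketch: flag values that were non-missing as numbers but fail as DMY dates
tostring var2, generate(var2_str) format(%08.0f)
gen var2_date = daily(var2_str, "DMY")
format var2_date %td

* anything listed here needs to be traced back to the source data
list var2 var2_str var2_date if missing(var2_date) & !missing(var2)

drop var2_str var2_date

With the example data, the value 1202021 should show up in this listing, for the reason given above. The same check can be repeated for opdateproper.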

But be aware that until you fix this problem you will get spurious results. For example 15032022 > 14062022, but interpreted as dates, the ordering is the opposite: 15mar2022 < 14jun2022. So you definitely need to work on this to get other than nonsense.
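
The reversal can be checked directly at the command line. This is only an illustration of the example just given, using the built-in mdy() function to construct the two dates:

Code:

* as plain numbers: displays 1 (true)
display 15032022 > 14062022

* as Stata dates, 15mar2022 vs 14jun2022: displays 0 (false)
display mdy(3, 15, 2022) > mdy(6, 14, 2022)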