
  • trying to use rangejoin - never had an issue before

    Hi, I've used -rangejoin- before and never had issues.
    Somehow this time it won't work.


    Dataset1 variables: admissiondate uniqueid gender
    Dataset2 variables: surgerydate uniqueid gender

    The uniqueid values are the same in both datasets. However, in Dataset2 several uniqueids are duplicated, so uniqueid does not uniquely identify observations there. I want to eliminate the duplicate uniqueids that carry incorrect information.

    Aim: to bring admissiondate from Dataset1 onto Dataset2 for each uniqueid found in both Dataset1 and Dataset2, matching on uniqueid and gender; where there is no match, admissiondate is left missing.
    Following this, I will check which observations have admissiondate == surgerydate. Any observation where admissiondate != surgerydate is tagged and then dropped.

    Code:
    clear all
    use "dataset1"
    rangejoin admissiondate using "Dataset2", by (uniqueid gender)
    Error: was expecting a numeric variable, a number or a system missing value for the interval low

    I used the above same code for another rangejoin and never had issues

    admissiondate - Stata date
    uniqueid - double
    gender - double

  • #2
    "I used the above same code for another rangejoin and never had issues"
    I really doubt that. That command is clearly not compatible with -rangejoin-'s syntax. You must specify a lower and an upper bound for the interval, following admissiondate and before using.

    Also, the variable that keys the range must be a variable in the using data, not the master. So you can't use admissiondate there.
    Code:
    rangejoin surgerydate . . using "Dataset2", by(uniqueid gender)
    But if you don't want to constrain which surgerydate values get paired up with which admissiondate values, then why use -rangejoin- at all? Just do
    Code:
    joinby uniqueid gender using "Dataset2"
    I suspect, though, that you really will want to constrain which surgery dates get paired with which admission dates. Say, maybe, within 90 days of each other? That would be
    Code:
    gen lb = admissiondate - 90
    gen ub = admissiondate + 90
    rangejoin surgerydate lb ub using "Dataset2", by(uniqueid gender)
    Always remember: the key variable for the rangejoin comes from the using data set, and the bounds come from the master data set.
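
    As for the follow-up step described in #1 (keeping only the pairs where the two dates agree), here is a minimal sketch. It assumes the 90-day -rangejoin- above has just been run, so both admissiondate and surgerydate are in memory. The variable name same_day is made up for illustration.
    Code:
    * flag pairs whose dates agree; unmatched admissions (missing surgerydate) are left unflagged
    gen byte same_day = (surgerydate == admissiondate) if !missing(surgerydate)
    * keep only the agreeing pairs; this drops both mismatches and unmatched admissions
    keep if same_day == 1
    If you want to look at the mismatches before discarding them, run -list- or -browse- with if same_day == 0 before the -keep-.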
