Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merging / matching by closest date

    Hi all, is there a way to merge / match by the ID variable and the closest date?

    I have data from two forms - an antenatal and last desk form. Participants in this study have multiple visits, and normally complete the antenatal form and the last desk form on the same day, making it easy to merge by the id variable plus the visit date. (The ID variable is rida on the antenatal form and ridl on the last desk form- they are the same for a given person). However, sometimes the last desk form is completed on a later date and when I try to merge the two forms together, the data for these visits ends up on two lines (i.e. does not merge) - see the final four lines in the example below for two people who completed their antenatal and last desk forms completed on different days. My goal is to have all the variables for that visit date combo appear in one line (not two).

    visdta = visit date for antenatal form; visdtl = visit date for the last desk form

    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long rida int visdta double bpavgsysa long ridl int visdtl byte highbpdl
    910384 22064 102.5 910384 22064 2
    910384 22078 91.5 910384 22078 2
    910385 21994 97.5 910385 21994 2
    910385 22025 98.5 910385 22025 2
    910385 22053 115.5 910385 22053 2
    910385 22078 106 910385 22078 2
    910386 21997 103.5 910386 21997 2
    910386 22025 105 910386 22025 2
    910386 22054 109.5 910386 22054 2
    910386 22083 105.5 910386 22083 2
    910384 22013 98.5 . . .
    . . . 910384 22017 2
    910385 22119 105.5 . . .
    . . . 910385 22124 2
    end
    format %tddd-Mon-YY visdta
    format %tddd-Mon-YY visdtl
    [/CODE]


    I tried the anymatch command (after copying the ridl value over to the rida slot and the visdtl over to the visdta slot):

    replace ride = ridl if ride==.
    replace visdta = visdtl if visdta==.
    anymatch visdta, id(rida) metric(visdta) near(1)

    but got the error "observations not uniquely identified by "rida" r(198);

    Any ideas for how to do this, either with anymatch or another command?
    Thanks,
    ~Cristina

  • #2
    If you have only one record per id, then use only the id variable to merge. On the other hand, with multiple records per id, you can form pairwise combinations within id using joinby and keep pairs with the minimum difference in dates. If you have difficulties doing this, copy and paste the output from the following from each of the datasets (before merging).

    Code:
    dataex if inrange(rida, 910384, 910387)

    Comment


    • #3
      Thank you very much, I will try this out!

      Comment


      • #4
        Thanks for this suggestion but there are cases in my dataset where there are multiple unmatched lines for the same person, so this method wouldn't allow those to be matched correctly. I believe I need a matching that involves the closest date. Does anyone have any ideas for how to do that?

        Comment


        • #5
          Thanks for this suggestion but there are cases in my dataset where there are multiple unmatched lines for the same person, so this method wouldn't allow those to be matched correctly. I believe I need a matching that involves the closest date. Does anyone have any ideas for how to do that?

          Comment


          • #6
            My guess is that unmatched records are those present in one dataset but not both. I would suggest that you present samples of both datasets illustrating some of the issues that you are encountering and your expectation on how to handle records not present in both datasets. However, this suggestion does not preclude anyone else from responding based on your description.

            Comment


            • #7
              You can use the user-written command nearmrg to merge based on nearest values, which includes dates. Check - https://ideas.repec.org/c/boc/bocode/s434901.html

              Comment

              Working...
              X