
  • Merge error (ID matched but merged data sometimes erroneous)

    Hello Everybody,

    This is my first time posting, so I hope I'm following the correct protocol. I'm merging two Excel documents; 5,597 families have an identification number in each data set. The first data set has general program information with demographics, and the second has outcomes. I have successfully merged the data sets by ID, but I am finding a curious error when I double-check the data: the merged-on outcomes data are sometimes incorrect, i.e., the occurrence date of an outcome is sometimes wrong. I pasted an example below, with the erroneous dates in bold blue font. I am using Stata 15.1.

    +----------------------------------------------------------------------------------------------------------+
    |   famid     txstart   term_date    termdate   timeintx   term_reason2   rerefer1   after_f..   after_f~2 |
    |----------------------------------------------------------------------------------------------------------|
    | 1095732   11 Feb 14   10 Apr 14   10 Apr 14         58       Arrested        Yes   26 Feb 14   26 Feb 14 |
    | 1090180   04 Apr 13   23 Jul 13   23 Jul 13        110       Arrested        Yes   29 Jun 13   29 Jun 13 |
    +----------------------------------------------------------------------------------------------------------+



    I sorted the data before doing a 1:1 merge, and it seemed successful because I received the following message:

    . tab _merge_linkagesfinal

               _merge_final |      Freq.     Percent        Cum.
    ------------------------+-----------------------------------
             using only (2) |          1        0.02        0.02
                matched (3) |      5,596       99.98      100.00
    ------------------------+-----------------------------------
                      Total |      5,597      100.00


    I would appreciate any help in figuring out why these dates occasionally do not match the original Excel data sets. Thanks!
    Last edited by James Simon; 09 Apr 2020, 13:26. Reason: Trying to align Stata output font.

  • #2
    You're going to run into a couple of things with your posting. First, this is the "Sandbox," which, per the description underneath the title, is designed for test postings and the like. People don't look here for postings to answer. Second, people really want to see the exact code you used that generated the problem (see the FAQ on this). And, finally, example data (search for -dataex- in the FAQ) are always helpful. Diagnosing the data problem will require, in particular, some information about how the data variables were imported.



    • #3
      Thank you. I reposted in the general forum (https://www.statalist.org/forums/for...us#post1546972). I hope this helps.

      How do I delete this post since it is in the wrong place? I don't see an option.
      Last edited by James Simon; 15 Apr 2020, 03:36. Reason: Updating to add new post link.



      • #4
        James:
        no need to delete.
        Kind regards,
        Carlo
        (Stata 18.0 SE)
