Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • merge m:m

    I know the issues with M:M merging, but I have 2 datasets that I'm struggling to get to merge correctly. In the first dataset I have multiple pregnancies per person (and their dates and outcomes, etc), and in the second dataset I have multiple addresses per person and the dates they lived there. My ultimate goal is to look to find the address that the subject lived at during that specific pregnancy, so I want to have all the addresses listed for each of the pregnancies. Using the below two example datasets, how would I merge these to get the resulting dataset:

    Code:
    Dataset 1 
    ID Preg_number Outcome_date
    1 1 15feb2001
    1 2 01jul2003
    1 3 12may2005
    2 1 28may2002
    2 2 02jan2005
    Dataset 2
    ID Address move_in_dt move_out_dt
    1 123 Main St 01feb2000 31dec2001
    1 345 Broadway Ave 01jan2002 31dec2010
    2 678 Ocean Blvd 01jan2000 30apr2002
    2 910 Frotange Rd 01may2002 31dec2004
    2 1112 Smith Dr 01jan2005 31dec2010
    Ideal Resulting Dataset
    ID Preg_number Outcome_date Address move_in_dt move_out_dt
    1 1 15feb2001 123 Main St 01feb2000 31dec2001
    1 1 15feb2001 345 Broadway Ave 01jan2002 31dec2010
    1 2 01jul2003 123 Main St 01feb2000 31dec2001
    1 2 01jul2003 345 Broadway Ave 01jan2002 31dec2010
    1 3 12may2005 123 Main St 01feb2000 31dec2001
    1 3 12may2005 345 Broadway Ave 01jan2002 31dec2010
    2 1 28may2002 678 Ocean Blvd 01jan2000 30apr2002
    2 1 28may2002 910 Frotange Rd 01may2002 31dec2004
    2 1 28may2002 1112 Smith Dr 01jan2005 31dec2010
    2 2 02jan2005 678 Ocean Blvd 01jan2000 30apr2002
    2 2 02jan2005 910 Frotange Rd 01may2002 31dec2004
    2 2 02jan2005 1112 Smith Dr 01jan2005 31dec2010
    I know I can add the preg_number variable to dataset 2 and just duplicate the addressed, but I have up to 12 pregnancies for some people and my address dataset already has 4155 observations, so I'd rather not do that, if I can avoid it. I feel like there should be a simple fix that I am just missing.



    Last edited by amandacpac; 17 Nov 2021, 11:15.

  • #2
    Regarding how to post example data so as to maximize convenience for people who might help you, I'd encourage you to re-read the StataList FAQ, with particular attention to the -dataex- command.

    Some advice useful to your current problem appears in Stata's PDF documentation for the -merge- command:

    First, if you think you need to perform an m:m merge, then we suspect you are wrong. If you would like to match every observation in the master to every observation in the using with the same values of the key variable(s), then you should be using joinby; see [D] joinby.

    Here's how that applies to your situation:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(ID Preg_number) str9 Outcome_date
    1 1 "15feb2001"
    1 2 "01jul2003"
    1 3 "12may2005"
    2 1 "28may2002"
    2 2 "02jan2005"
    end
    tempfile data1
    save `data1'
    clear
    input byte ID int Address str22 move_in_dt str9 move_out_dt
    1  123 "Main St 01feb2000"      "31dec2001"
    1  345 "Broadway Ave 01jan2002" "31dec2010"
    2  678 "Ocean Blvd 01jan2000"   "30apr2002"
    2  910 "Frotange Rd 01may2002"  "31dec2004"
    2 1112 "Smith Dr 01jan2005"     "31dec2010"
    end
    //
    joinby ID using `data1'
    // Display as you did.
    sort ID Preg_number
    order ID Preg_number Outcome_date Address move_in_dt move_out_dt
    list

    Comment


    • #3
      Sorry about the coding. I will re-read the FAQs. Also, JOINBY is exactly what I needed. Thank you so much!

      Comment

      Working...
      X