merge m:m

amandacpac

Join Date: Sep 2014
Posts: 55

17 Nov 2021, 10:38

I know the issues with M:M merging, but I have 2 datasets that I'm struggling to get to merge correctly. In the first dataset I have multiple pregnancies per person (and their dates and outcomes, etc), and in the second dataset I have multiple addresses per person and the dates they lived there. My ultimate goal is to look to find the address that the subject lived at during that specific pregnancy, so I want to have all the addresses listed for each of the pregnancies. Using the below two example datasets, how would I merge these to get the resulting dataset:

Code:

Dataset 1 ID
Preg_number
Outcome_date

1
1
15feb2001

1
2
01jul2003

1
3
12may2005

2
1
28may2002

2
2
02jan2005


Dataset 2 ID
Address
move_in_dt
move_out_dt

1
123 Main St
01feb2000
31dec2001

1
345 Broadway Ave
01jan2002
31dec2010

2
678 Ocean Blvd
01jan2000
30apr2002

2
910 Frotange Rd
01may2002
31dec2004

2
1112 Smith Dr
01jan2005
31dec2010


Ideal Resulting Dataset ID
Preg_number
Outcome_date
Address
move_in_dt
move_out_dt

1
1
15feb2001
123 Main St
01feb2000
31dec2001

1
1
15feb2001
345 Broadway Ave
01jan2002
31dec2010

1
2
01jul2003
123 Main St
01feb2000
31dec2001

1
2
01jul2003
345 Broadway Ave
01jan2002
31dec2010

1
3
12may2005
123 Main St
01feb2000
31dec2001

1
3
12may2005
345 Broadway Ave
01jan2002
31dec2010

2
1
28may2002
678 Ocean Blvd
01jan2000
30apr2002

2
1
28may2002
910 Frotange Rd
01may2002
31dec2004

2
1
28may2002
1112 Smith Dr
01jan2005
31dec2010

2
2
02jan2005
678 Ocean Blvd
01jan2000
30apr2002

2
2
02jan2005
910 Frotange Rd
01may2002
31dec2004

2
2
02jan2005
1112 Smith Dr
01jan2005
31dec2010

I know I can add the preg_number variable to dataset 2 and just duplicate the addressed, but I have up to 12 pregnancies for some people and my address dataset already has 4155 observations, so I'd rather not do that, if I can avoid it. I feel like there should be a simple fix that I am just missing.

Last edited by amandacpac; 17 Nov 2021, 11:15.

Tags: None

Mike Lacy

Join Date: Apr 2014

Posts: 2416
#2

17 Nov 2021, 11:40

Regarding how to post example data so as to maximize convenience for people who might help you, I'd encourage you to re-read the StataList FAQ, with particular attention to the -dataex- command.

Some advice useful to your current problem appears in Stata's PDF documentation for the -merge- command:

First, if you think you need to perform an m:m merge, then we suspect you are wrong. If you would like to match every observation in the master to every observation in the using with the same values of the key variable(s), then you should be using joinby; see [D] joinby.

Here's how that applies to your situation:

Code:

* Example generated by -dataex-. To install: ssc install dataex clear input byte(ID Preg_number) str9 Outcome_date 1 1 "15feb2001" 1 2 "01jul2003" 1 3 "12may2005" 2 1 "28may2002" 2 2 "02jan2005" end tempfile data1 save `data1' clear input byte ID int Address str22 move_in_dt str9 move_out_dt 1 123 "Main St 01feb2000" "31dec2001" 1 345 "Broadway Ave 01jan2002" "31dec2010" 2 678 "Ocean Blvd 01jan2000" "30apr2002" 2 910 "Frotange Rd 01may2002" "31dec2004" 2 1112 "Smith Dr 01jan2005" "31dec2010" end // joinby ID using `data1' // Display as you did. sort ID Preg_number order ID Preg_number Outcome_date Address move_in_dt move_out_dt list
2 likes
Comment
amandacpac

Join Date: Sep 2014

Posts: 55
#3

17 Nov 2021, 11:51

Sorry about the coding. I will re-read the FAQs. Also, JOINBY is exactly what I needed. Thank you so much!
Comment

Announcement

merge m:m

Comment

Comment