Dear Statalisters,
I have the following two datasets (Dataset-1 and Dataset-2). Dataset-1 has around 5 million observations with repeated ids and dates (qdate). Dataset-2 has the same ids but different dates (sdate), with 1 million observations. Being aware that merge m:m is a bad thing to do, I was hoping to use 'joinby' with Dataset-1 as master and Dataset-2 as 'using' to bring 'sdate' from Dataset-2 into Dataset-1. My Stata freezes. I would highly appreciate any suggestions on how to merge these two datasets.
In a vain attempt, I split Dataset-1 into 4 parts with roughly equal numbers of observations (about 1 million each), but Stata still freezes. Please see the dataset structures below. Although the examples below show only one date for brevity, the dates vary within person in both datasets.
Stata version: Version 17.0
Dataset1:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input double id int qdate
10013930 22492
10013930 22492
10013930 22492
10013930 22492
10013930 22492
10013930 22492
10013930 22492
10013930 22492
10013930 22492
10013930 22492
10013930 22492
10013930 22492
end
format %td qdate
Dataset2:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input long id int sdate
10013930 22292
10013930 22292
10013930 22292
10013930 22292
10013930 22292
10013930 22292
10013930 22292
10013930 22292
10013930 22292
10013930 22292
10013930 22292
10013930 22292
end
format %td sdate
Joinby command:
Code:
joinby id using data2.dta, unmatched(both)
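For scale, here is a rough sketch of how the size of the joinby result could be estimated before running it. It assumes Dataset-1 is saved as data1.dta (a hypothetical file name; only data2.dta appears in the command above) and that id is the only key. Because joinby forms every pairwise combination within id, the 12 example rows per file shown above would alone produce 12 × 12 = 144 observations for that id.

```stata
* Sketch: estimate how many observations -joinby id- would create.
* Assumes Dataset-1 is saved as data1.dta (hypothetical name)
* and Dataset-2 as data2.dta.

* Count observations per id in Dataset-2
use id using data2.dta, clear
contract id, freq(n2)
tempfile counts2
save `counts2'

* Count observations per id in Dataset-1
use id using data1.dta, clear
contract id, freq(n1)

* Both files now hold one row per id, so a 1:1 merge is safe
merge 1:1 id using `counts2'

* Matched ids contribute n1*n2 rows; with unmatched(both),
* unmatched ids keep their own rows from whichever file they come from
generate double pairs = cond(_merge == 3, n1 * n2, cond(_merge == 1, n1, n2))

* Total number of observations the joinby result would hold
summarize pairs, meanonly
display %20.0fc r(sum)
```

If the displayed total is far larger than what 8 GB of memory can hold, that would be consistent with Stata appearing to freeze, although this is only a diagnostic under the assumptions stated above.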
System info:
Code:
Hardware Overview:
Model Name: MacBook Air
Chip: Apple M2
Total Number of Cores: 8 (4 performance and 4 efficiency)
Memory: 8 GB
System Firmware Version: 8419.80.7
OS Loader Version: 8419.80.7
OS: Ventura 13.2.1
