Merge (m:1) and data ordering

Titir Bhattacharya

Join Date: Mar 2019
Posts: 226

Merge (m:1) and data ordering

02 Mar 2019, 15:19

Hi all,
please consider the following data example:

Code:

/*clear

. input str6 id byte str4 loc

            id        loc
  1. 67HT60 6010
  2. 67HT60 6010
  3. 67HT60 6010
  4. 57HT60 6010
  5. 85PG61 6011
  6. 85PG61 6011
  7. 78PG61 6011
  8. 87PG61 6011
  9. end

. list

     +---------------+
     |     id    loc |
     |---------------|
  1. | 67HT60   6010 |
  2. | 67HT60   6010 |
  3. | 67HT60   6010 |
  4. | 57HT60   6010 |
  5. | 85PG61   6011 |
     |---------------|
  6. | 85PG61   6011 |
  7. | 78PG61   6011 |
  8. | 87PG61   6011 |
     +---------------+

. save "data1"
file data1.dta saved

. clear

. input str4 loc byte str2 prop

           loc       prop
  1. 6010 p1
  2. 6011 p2
  3. end

. list

     +-------------+
     |  loc   prop |
     |-------------|
  1. | 6010     p1 |
  2. | 6011     p2 |
     +-------------+

. save "data2"
file data2.dta saved

. clear

. use "C:\Users\TITIR\Desktop\Courses and papers\Panel and time series\data1.dta", clear

. merge m:1 loc using "data2.dta"

    Result                           # of obs.
    -----------------------------------------
    not matched                             0
    matched                                 8  (_merge==3)
    -----------------------------------------

. list

     +------------------------------------+
     |     id    loc   prop        _merge |
     |------------------------------------|
  1. | 67HT60   6010     p1   matched (3) |
  2. | 67HT60   6010     p1   matched (3) |
  3. | 67HT60   6010     p1   matched (3) |
  4. | 57HT60   6010     p1   matched (3) |
  5. | 85PG61   6011     p2   matched (3) |
     |------------------------------------|
  6. | 85PG61   6011     p2   matched (3) |
  7. | 78PG61   6011     p2   matched (3) |
  8. | 87PG61   6011     p2   matched (3) |
     +------------------------------------+

. drop _merge

. list

     +----------------------+
     |     id    loc   prop |
     |----------------------|
  1. | 67HT60   6010     p1 |
  2. | 67HT60   6010     p1 |
  3. | 67HT60   6010     p1 |
  4. | 57HT60   6010     p1 |
  5. | 85PG61   6011     p2 |
     |----------------------|
  6. | 85PG61   6011     p2 |
  7. | 78PG61   6011     p2 |
  8. | 87PG61   6011     p2 |
     +----------------------+

. save "merged_1"
file merged_1.dta saved

. clear

. input str6 id byte str2 Inc

            id        Inc
  1. 67HT60 I1
  2. 57HT60 I2
  3. 85PG61 I3
  4. 78PG61 I4
  5. 87PG61 I5
  6. end

. list

     +--------------+
     |     id   Inc |
     |--------------|
  1. | 67HT60    I1 |
  2. | 57HT60    I2 |
  3. | 85PG61    I3 |
  4. | 78PG61    I4 |
  5. | 87PG61    I5 |
     +--------------+

. save "data3"
file data3.dta saved

. clear

. use "C:\Users\TITIR\Desktop\Courses and papers\Panel and time series\merged_1.dta", clear

.  merge m:1 id using "data3.dta"

    Result                           # of obs.
    -----------------------------------------
    not matched                             0
    matched                                 8  (_merge==3)
    -----------------------------------------

. list

     +------------------------------------------+
     |     id    loc   prop   Inc        _merge |
     |------------------------------------------|
  1. | 57HT60   6010     p1    I2   matched (3) |
  2. | 67HT60   6010     p1    I1   matched (3) |
  3. | 67HT60   6010     p1    I1   matched (3) |
  4. | 67HT60   6010     p1    I1   matched (3) |
  5. | 78PG61   6011     p2    I4   matched (3) |
     |------------------------------------------|
  6. | 85PG61   6011     p2    I3   matched (3) |
  7. | 85PG61   6011     p2    I3   matched (3) |
  8. | 87PG61   6011     p2    I5   matched (3) |
     +------------------------------------------+

. save "merged_2"
file merged_2.dta saved

. drop _merge

. list

     +----------------------------+
     |     id    loc   prop   Inc |
     |----------------------------|
  1. | 57HT60   6010     p1    I2 |
  2. | 67HT60   6010     p1    I1 |
  3. | 67HT60   6010     p1    I1 |
  4. | 67HT60   6010     p1    I1 |
  5. | 78PG61   6011     p2    I4 |
     |----------------------------|
  6. | 85PG61   6011     p2    I3 |
  7. | 85PG61   6011     p2    I3 |
  8. | 87PG61   6011     p2    I5 |
     +----------------------------+

. save "merge"
file merge.dta saved*/

In the final "merge.dta" the order of "id" is different from that in "data1.dta". How should I code such that the final merged file preserves the order of "id" followed in "data1.dta"?

Thanks.

Tags: None

Clyde Schechter

Join Date: Apr 2014

Posts: 30170
#2

02 Mar 2019, 15:35

This is fairly similar to the problem you posted recently. What is needed is to create a variable that represents the sort order of the data. This is best done in a data set where each id uniquely identifies observations, and your data2.dta satisfies this.

Code:

use data2, clear gen long sort_order = _n save data2, replace

Now any data set you create that includes a merger with data2 will contain this sort_order variable. So just run -sort sort_order- and you will have it.
1 like
Comment
Titir Bhattacharya

Join Date: Mar 2019

Posts: 226
#3

02 Mar 2019, 20:39

Originally posted by Clyde Schechter View Post

This is fairly similar to the problem you posted recently. What is needed is to create a variable that represents the sort order of the data. This is best done in a data set where each id uniquely identifies observations, and your data2.dta satisfies this.

Code:

use data2, clear gen long sort_order = _n save data2, replace

Now any data set you create that includes a merger with data2 will contain this sort_order variable. So just run -sort sort_order- and you will have it.

It worked brilliantly. Also, thanks a lot for explaining the rationale. I'm new to stata and your explanation helped a lot.
Comment

Announcement

Merge (m:1) and data ordering

Comment

Comment