Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merge (m:1) and data ordering

    Hi all,
    please consider the following data example:

    Code:
    /*clear
    
    . input str6 id byte str4 loc
    
                id        loc
      1. 67HT60 6010
      2. 67HT60 6010
      3. 67HT60 6010
      4. 57HT60 6010
      5. 85PG61 6011
      6. 85PG61 6011
      7. 78PG61 6011
      8. 87PG61 6011
      9. end
    
    . list
    
         +---------------+
         |     id    loc |
         |---------------|
      1. | 67HT60   6010 |
      2. | 67HT60   6010 |
      3. | 67HT60   6010 |
      4. | 57HT60   6010 |
      5. | 85PG61   6011 |
         |---------------|
      6. | 85PG61   6011 |
      7. | 78PG61   6011 |
      8. | 87PG61   6011 |
         +---------------+
    
    . save "data1"
    file data1.dta saved
    
    . clear
    
    . input str4 loc byte str2 prop
    
               loc       prop
      1. 6010 p1
      2. 6011 p2
      3. end
    
    . list
    
         +-------------+
         |  loc   prop |
         |-------------|
      1. | 6010     p1 |
      2. | 6011     p2 |
         +-------------+
    
    . save "data2"
    file data2.dta saved
    
    . clear
    
    . use "C:\Users\TITIR\Desktop\Courses and papers\Panel and time series\data1.dta", clear
    
    . merge m:1 loc using "data2.dta"
    
        Result                           # of obs.
        -----------------------------------------
        not matched                             0
        matched                                 8  (_merge==3)
        -----------------------------------------
    
    . list
    
         +------------------------------------+
         |     id    loc   prop        _merge |
         |------------------------------------|
      1. | 67HT60   6010     p1   matched (3) |
      2. | 67HT60   6010     p1   matched (3) |
      3. | 67HT60   6010     p1   matched (3) |
      4. | 57HT60   6010     p1   matched (3) |
      5. | 85PG61   6011     p2   matched (3) |
         |------------------------------------|
      6. | 85PG61   6011     p2   matched (3) |
      7. | 78PG61   6011     p2   matched (3) |
      8. | 87PG61   6011     p2   matched (3) |
         +------------------------------------+
    
    . drop _merge
    
    . list
    
         +----------------------+
         |     id    loc   prop |
         |----------------------|
      1. | 67HT60   6010     p1 |
      2. | 67HT60   6010     p1 |
      3. | 67HT60   6010     p1 |
      4. | 57HT60   6010     p1 |
      5. | 85PG61   6011     p2 |
         |----------------------|
      6. | 85PG61   6011     p2 |
      7. | 78PG61   6011     p2 |
      8. | 87PG61   6011     p2 |
         +----------------------+
    
    . save "merged_1"
    file merged_1.dta saved
    
    . clear
    
    . input str6 id byte str2 Inc
    
                id        Inc
      1. 67HT60 I1
      2. 57HT60 I2
      3. 85PG61 I3
      4. 78PG61 I4
      5. 87PG61 I5
      6. end
    
    . list
    
         +--------------+
         |     id   Inc |
         |--------------|
      1. | 67HT60    I1 |
      2. | 57HT60    I2 |
      3. | 85PG61    I3 |
      4. | 78PG61    I4 |
      5. | 87PG61    I5 |
         +--------------+
    
    . save "data3"
    file data3.dta saved
    
    . clear
    
    . use "C:\Users\TITIR\Desktop\Courses and papers\Panel and time series\merged_1.dta", clear
    
    .  merge m:1 id using "data3.dta"
    
        Result                           # of obs.
        -----------------------------------------
        not matched                             0
        matched                                 8  (_merge==3)
        -----------------------------------------
    
    . list
    
         +------------------------------------------+
         |     id    loc   prop   Inc        _merge |
         |------------------------------------------|
      1. | 57HT60   6010     p1    I2   matched (3) |
      2. | 67HT60   6010     p1    I1   matched (3) |
      3. | 67HT60   6010     p1    I1   matched (3) |
      4. | 67HT60   6010     p1    I1   matched (3) |
      5. | 78PG61   6011     p2    I4   matched (3) |
         |------------------------------------------|
      6. | 85PG61   6011     p2    I3   matched (3) |
      7. | 85PG61   6011     p2    I3   matched (3) |
      8. | 87PG61   6011     p2    I5   matched (3) |
         +------------------------------------------+
    
    . save "merged_2"
    file merged_2.dta saved
    
    . drop _merge
    
    . list
    
         +----------------------------+
         |     id    loc   prop   Inc |
         |----------------------------|
      1. | 57HT60   6010     p1    I2 |
      2. | 67HT60   6010     p1    I1 |
      3. | 67HT60   6010     p1    I1 |
      4. | 67HT60   6010     p1    I1 |
      5. | 78PG61   6011     p2    I4 |
         |----------------------------|
      6. | 85PG61   6011     p2    I3 |
      7. | 85PG61   6011     p2    I3 |
      8. | 87PG61   6011     p2    I5 |
         +----------------------------+
    
    . save "merge"
    file merge.dta saved*/
    In the final "merge.dta" the order of "id" is different from that in "data1.dta". How should I code such that the final merged file preserves the order of "id" followed in "data1.dta"?

    Thanks.


  • #2
    This is fairly similar to the problem you posted recently. What is needed is to create a variable that represents the sort order of the data. This is best done in a data set where each id uniquely identifies observations, and your data2.dta satisfies this.

    Code:
    use data2, clear
    gen long sort_order = _n
    save data2, replace
    Now any data set you create that includes a merger with data2 will contain this sort_order variable. So just run -sort sort_order- and you will have it.

    Comment


    • #3
      Originally posted by Clyde Schechter View Post
      This is fairly similar to the problem you posted recently. What is needed is to create a variable that represents the sort order of the data. This is best done in a data set where each id uniquely identifies observations, and your data2.dta satisfies this.

      Code:
      use data2, clear
      gen long sort_order = _n
      save data2, replace
      Now any data set you create that includes a merger with data2 will contain this sort_order variable. So just run -sort sort_order- and you will have it.
      It worked brilliantly. Also, thanks a lot for explaining the rationale. I'm new to stata and your explanation helped a lot.

      Comment

      Working...
      X