Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Helping with merge

    Hello,

    I would like to know if it was possible to make a merge on several levels?

    I have two databases that look like this:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1 var1 str3 var2 str4 job float w
    "A" "Tal" "Stat" 1500
    "A" "Tal" "Eco"  2100
    "B" "Bor" "Eco"  3500
    "B" "Bor" "Stat" 5400
    "C" "Bor" "Stat" 1000
    end
    
    clear
    input str1 var1 str3 var2 str4 job_2 float age
    "A" "Tal" "Eco"  51
    "A" "Tal" "Stat" 20
    "B" "Bor" "Stat" 10
    "B" "Bor" "Eco"  11
    "C" "Bor" "Eco" 54
    end
    So, i want to merge using var1, var2 and job. For 90% of my database, it works, but i have some people that changed their jobs between the two dataset.
    Is it possible to merge and having something like that :

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1 var1 str3 var2 str4 job float w str4 job_2 float age
    "A" "Tal" "Stat" 1500 "Stat" 20
    "A" "Tal" "Eco"  2100 "Eco"  51
    "B" "Bor" "Eco"  3500 "Eco"  11
    "B" "Bor" "Stat" 5400 "Stat" 10
    "C" "Bor" "Stat" 1000 "Eco"  54
    end
    So, for the first four i want merge var1 var2 job.... and for the last one only merge var1 var2

    I don't know if we an do that,
    Thanks !
    Last edited by Raph Selenite; 06 May 2022, 07:36.

  • #2
    -merge- cannot do that, but you can get the result you want from -joinby- and a little subsequent pruning.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str1 var1 str3 var2 str4 job float w
    "A" "Tal" "Stat" 1500
    "A" "Tal" "Eco"  2100
    "B" "Bor" "Eco"  3500
    "B" "Bor" "Stat" 5400
    "C" "Bor" "Stat" 1000
    end
    tempfile file1
    save `file1'
    
    clear
    input str1 var1 str3 var2 str4 job_2 float age
    "A" "Tal" "Eco"  51
    "A" "Tal" "Stat" 20
    "B" "Bor" "Stat" 10
    "B" "Bor" "Eco"  11
    "C" "Bor" "Eco" 54
    end
    tempfile file2
    save `file2'
    
    use `file1', clear
    joinby var1 var2 using `file2'
    by var1 var2 job, sort: gen priority = (job == job_2)
    by var1 var2 job (priority), sort: keep if _n == _N
    drop priority

    Comment


    • #3
      That worked perfectly with my data.
      Thanks !

      Comment

      Working...
      X