Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merge with if conditions

    Hello everybody,

    I have a hospital dataset (master.dta) containing information about surgeries performed in separate years. I would like to merge additional information about the cancer certification status of each hospital (using.dta). The datasets contains two differen variables for the geographical information of each hospital: ZIP codes (kh_plz) and municipality keys (hosp_AGS). I want to perform a hierarchical merge. First, I want to merge the data using the hospital zip code (kh_plz) information. For all unsuccessful mergers (_merge = 2), I would like to merge the data using the municipality keys (hosp_AGS). As there is no if condition for merging in Stata, could you suggest a workaround? Please note that the data i present is only a small snippet of the original file, so the results won't make to much sense....

    Kindly,

    Fred

    Code:
    use master.dta, clear                                                        // Upload Data
    merge m:1 kh_plz surgery year using "using.dta"                        // m:1 merge with using data
    gen DKG_Zentrum = 1 if _merge==3                                                              // Generate a variable for a certified hospital if merging was successful
    
    merge m:1 hosp_AGS surgery year using "using.dta"  if _merge==2            // How do can I make this word

    Master File

    Code:
    * Master File
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long(kh_plz hosp_AGS) float(surgery year)
    44649  2112000 2 2020
    14165  9170147 2 2020
    23562  5156016 4 2012
    23562  5057119 4 2014
    32756  3115072 1 2017
    32756  5013020 1 2013
    32756  4371000 1 2012
    32756 11189016 1 2013
    23562  5661012 4 2013
    25746     9111 4 2018
    25746 12116032 4 2013
    25746  3226000 4 2019
    25746  5462070 4 2018
    25746  2712001 4 2012
    25746  9662000 4 2018
    32756 11152000 1 2011
    32756 11061000 1 2014
    25746  6225032 4 2020
    23730  5359155 4 2019
    23730 12211000 4 2017
    23730 13074003 4 2017
    23730    55002 4 2013
    23730  9403000 4 2014
    23730  7656000 4 2013
    40489  5401022 5 2011
    40489  8115050 5 2011
    40489 14752000 5 2011
    40489  9303000 5 2013
    28205 14015000 1 2011
    32756 15091000 1 2017
    21339  5072081 4 2011
    40489  8313000 5 2013
    40489  3004006 5 2015
    63755 14271002 5 2020
    28209 16303045 4 2011
    28209  6130000 4 2013
    28209  9434000 4 2011
    28209  8363008 4 2017
    28209  8324028 4 2020
    end
    label values surgery surgery
    label def surgery 1 "Pancreas", modify
    label def surgery 2 "Gastric", modify
    label def surgery 4 "Colon/Rectal", modify
    label def surgery 5 "Lung", modify

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long(kh_plz hosp_AGS) float(surgery year)
    44649  2112000 2 2020
    14165  9170147 2 2020
    23562  5156016 4 2012
    23562  5057119 4 2014
    32756  3115072 1 2017
    32756  5013020 1 2013
    32756  4371000 1 2012
    32756 11189016 1 2013
    23562  5661012 4 2013
    25746     9111 4 2018
    25746 12116032 4 2013
    25746  3226000 4 2019
    25746  5462070 4 2018
    25746  2712001 4 2012
    25746  9662000 4 2018
    32756 11152000 1 2011
    32756 11061000 1 2014
    25746  6225032 4 2020
    23730  5359155 4 2019
    23730 12211000 4 2017
    23730 13074003 4 2017
    23730    55002 4 2013
    23730  9403000 4 2014
    23730  7656000 4 2013
    40489  5401022 5 2011
    40489  8115050 5 2011
    40489 14752000 5 2011
    40489  9303000 5 2013
    28205 14015000 1 2011
    32756 15091000 1 2017
    21339  5072081 4 2011
    40489  8313000 5 2013
    40489  3004006 5 2015
    63755 14271002 5 2020
    28209 16303045 4 2011
    28209  6130000 4 2013
    28209  9434000 4 2011
    28209  8363008 4 2017
    28209  8324028 4 2020
    end
    label values surgery surgery
    label def surgery 1 "Pancreas", modify
    label def surgery 2 "Gastric", modify
    label def surgery 4 "Colon/Rectal", modify
    label def surgery 5 "Lung", modify
    ------------------ copy up to and including the previous line ------------------

    Listed 39 out of 39 observations



    Using-File
    ----------------------- copy starting from the next line -----------------------
    Code:
    * Using-File
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long(kh_plz hosp_AGS) float(surgery year dkg_certification_date)
    24939 1001000 4 2013 19391
    24105 1002000 1 2012 19247
    24105 1002000 2 2020 21943
    24105 1002000 4 2012 19247
    24105 1002000 5 2019 21613
    24105 1002000 6 2012 19311
    24116 1002000 1 2020 21937
    24116 1002000 4 2014 19960
    25105 1002000 7 2018 21213
    23560 1003000 4 2015 20388
    23560 1003000 6 2016 20728
    23562 1003000 1 2014 19852
    23562 1003000 2 2019 21880
    23562 1003000 4 2014 19852
    24534 1004000 4 2016 20485
    25746 1051044 4 2013 19516
    23730 1055032 4 2021 22475
    25421 1056039 4 2009 18074
    24768 1058135 4 2012 19054
    24768 1058135 6 2011 18722
    25524 1061046 4 2012 19247
    22927 1062023 5 2012 19179
    21465 1062060 1 2020 22034
    21465 1062060 4 2020 22034
    20246 2000000 1 2021 22465
    20246 2000000 2 2021 22465
    20246 2000000 6 2008 17570
    20251 2000000 4 2014 20013
    20259 2000000 4 2007 17372
    22297 2000000 4 2021 22516
    22307 2000000 1 2010 18610
    22307 2000000 4 2009 18153
    38124 3101000 4 2016 20629
    38126 3101000 1 2016 20489
    38126 3101000 3 2019 21600
    38126 3101000 4 2014 19940
    38126 3101000 5 2018 21228
    38126 3101000 6 2011 18983
    38440 3103000 4 2016 20597
    38440 3103000 6 2010 18613
    38518 3151009 4 2008 17749
    38302 3158037 4 2014 19927
    37075 3159016 1 2018 21263
    37075 3159016 4 2016 20796
    37075 3159016 5 2015 20128
    37075 3159016 6 2018 21259
    30169 3241001 6 2009 18001
    30171 3241001 4 2012 19281
    30459 3241001 1 2016 20656
    30459 3241001 4 2011 18794
    30459 3241001 5 2010 18379
    30459 3241001 6 2011 18752
    30559 3241001 4 2008 17883
    30625 3241001 1 2018 21208
    30625 3241001 2 2019 21593
    30625 3241001 3 2018 21208
    30625 3241001 4 2019 21593
    30625 3241001 5 2021 22308
    30625 3241001 6 2012 19058
    30625 3241001 7 2017 20844
    30938 3241004 6 2011 18759
    30989 3241006 1 2014 20027
    30989 3241006 4 2011 18871
    31134 3254021 1 2020 22198
    31134 3254021 4 2009 18203
    31135 3254021 1 2013 19553
    31135 3254021 4 2007 17455
    29223 3351006 1 2016 20805
    29223 3351006 4 2013 19372
    21244 3353005 4 2010 18394
    21423 3353040 1 2021 22545
    21423 3353040 4 2012 19157
    21339 3355022 1 2014 19822
    21339 3355022 4 2010 18470
    21339 3355022 6 2012 19066
    27356 3357039 3 2016 20629
    27356 3357039 4 2011 18728
    21682 3359038 1 2016 20730
    21682 3359038 4 2007 17381
    26121 3403000 1 2011 18842
    26121 3403000 3 2018 21462
    26121 3403000 4 2011 18842
    26121 3403000 5 2011 18675
    26133 3403000 6 2009 18144
    49074 3404000 1 2020 22008
    49074 3404000 4 2010 18333
    49076 3404000 1 2012 19219
    49076 3404000 4 2011 18744
    26655 3451007 4 2007 17188
    26655 3451007 6 2014 19940
    49808 3454032 4 2015 20118
    49124 3459019 4 2015 20165
    49179 3459029 5 2011 18735
    49377 3460009 4 2010 18493
    28205 4011000 1 2014 19964
    28205 4011000 4 2014 19964
    28205 4011000 6 2012 19109
    28239 4011000 4 2008 17713
    28325 4011000 5 2009 18050
    40217 5111000 1 2012 19064
    end
    format %td dkg_certification_date
    label values surgery surgery
    label def surgery 1 "Pancreas", modify
    label def surgery 2 "Gastric", modify
    label def surgery 3 "Eosophageal", modify
    label def surgery 4 "Kolorektal", modify
    label def surgery 5 "Lung", modify
    label def surgery 6 "Prostata", modify
    label def surgery 7 "Leber", modify

  • #2
    This hasn't been tested, but essentially, you want to merge the unmatched dataset using municipality keys, and then merge it back into your original dataset. In this approach, the 'if' condition is simulated through a series of merges.

    Code:
    use master.dta, clear                                                        
    merge m:1 kh_plz surgery year using "using.dta"                        
    tempfile master
    save `master'
    keep if _merge==2
    merge m:1 hosp_AGS surgery year using "using.dta", nogen
    merge 1:1 kh_plz hosp_AGS surgery year using `master', nogen

    Comment


    • #3
      But since the variables are the same, you could also use the -update- option of merge. Values written during the first merge will not be overwritten by the second. I believe this achieves the same result as #2.

      Code:
      use master.dta, clear
      merge m:1 kh_plz surgery year using "using.dta"
      merge m:1 hosp_AGS surgery year using "using.dta", nogen update

      Comment

      Working...
      X