Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • matched sample

    Hello!

    I need help with creating a matched sample.

    I would like to match firms (var:ID) that switch offices (Change=1) to firms in the same industry (SICCategory), year (Year), country (Country) and closest in size (SIZE) that do not switch offices (Change=0)

    Appreciate your support

    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte ID str7 Country int Year byte(Change Industry) double SIZE
     1 "AUSTRIA" 2013 0 2 13.09431
     2 "AUSTRIA" 2013 0 9 9.393087
     3 "BELGIUM" 2013 1 6 15.40521
     4 "BELGIUM" 2013 0 9 11.33719
     5 "BELGIUM" 2013 0 9 10.41446
     6 "AUSTRIA" 2014 0 2 13.03728
     7 "BELGIUM" 2014 1 9 11.08366
     8 "BELGIUM" 2014 0 9 10.62015
     9 "BELGIUM" 2014 0 9 10.40085
    10 "AUSTRIA" 2013 0 9 14.63176
    11 "BELGIUM" 2013 0 9 12.88997
    12 "BELGIUM" 2013 0 2 15.30626
    13 "BELGIUM" 2013 0 6 12.16089
    14 "BELGIUM" 2013 0 9 12.16911
    15 "BELGIUM" 2013 0 6 12.71264
    16 "BELGIUM" 2013 0 2 17.05595
    17 "AUSTRIA" 2014 0 9 14.52294
    18 "BELGIUM" 2014 1 9 12.70191
    19 "BELGIUM" 2014 0 6 12.55162
    20 "BELGIUM" 2014 0 6 12.03191
    21 "BELGIUM" 2014 0 2 16.89398
    22 "BELGIUM" 2014 0 9 11.88911
    23 "BELGIUM" 2014 0 2 15.34024
    24 "AUSTRIA" 2013 0 9 13.45583
    25 "AUSTRIA" 2013 0 2 17.59792
    26 "BELGIUM" 2013 0 9 16.52456
    27 "BELGIUM" 2013 0 9 10.82559
    28 "BELGIUM" 2013 0 9 10.74437
    29 "AUSTRIA" 2014 0 2 17.53405
    30 "AUSTRIA" 2014 0 9 13.30309
    31 "BELGIUM" 2014 0 9 10.65528
    32 "BELGIUM" 2014 0 9 16.44831
    33 "BELGIUM" 2013 0 9 10.82559
    34 "AUSTRIA" 2013 0 9 9.825234
    35 "BELGIUM" 2013 0 9 15.08007
    36 "BELGIUM" 2013 0 9 12.96676
    37 "AUSTRIA" 2014 0 9 9.396856
    38 "BELGIUM" 2014 0 9 14.94482
    39 "BELGIUM" 2014 0 6 15.22952
    40 "BELGIUM" 2014 0 9 12.95127
    41 "AUSTRIA" 2013 0 6 10.96973
    42 "AUSTRIA" 2013 0 9 10.54082
    43 "AUSTRIA" 2013 0 9 10.27981
    44 "AUSTRIA" 2013 0 6 10.74546
    45 "BELGIUM" 2013 0 9  14.5234
    46 "AUSTRIA" 2014 0 6 10.36204
    47 "AUSTRIA" 2014 0 9 9.876927
    48 "AUSTRIA" 2014 0 6 14.04012
    49 "AUSTRIA" 2014 0 9 10.31835
    50 "BELGIUM" 2014 0 9 14.48282
    end
    ------------------ copy up to and including the previous line ------------------

  • #2
    Code:
    preserve
    keep if Change
    drop Change
    rename (ID SIZE) =1
    tempfile cases
    save `cases'
    
    restore
    keep if !Change
    drop Change
    rename (ID SIZE) =0
    tempfile controls
    save `controls'
    
    use `cases', clear
    joinby Country Year Industry using `controls'
    gen delta = abs(SIZE1-SIZE0)
    by ID1 (delta), sort: keep if _n == 1
    
    gen `c(obs_t)' pair_num = _n
    reshape long ID SIZE, i(pair_num) j(Change)
    Note: In the event that there are two non-change IDs that match an ID on Country, Year, and Industry and they are tied for closest in SIZE, this code picks one at random.

    Added: I have assumed here that by "closest in size" you mean that the difference between the sizes is closest to zero. Though it is, in my experience, less common, I have seen people also do matching where they chose the one such that the ratio of the sizes is closest to one. That can produce different choices.
    Last edited by Clyde Schechter; 07 Jul 2023, 14:28.

    Comment


    • #3
      Hi Clyde, it works perfectly
      Thank you so much for your help

      Comment

      Working...
      X