Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • #16
    To begin with, my concern with switching from one control to another is that is very important that the match is as close as possible in "btm". However, I get your point and I think that doing it in the way you suggest is going to be more preferable and convenient down the road.

    Secondly, for what I have understood there must be an observation of btm for every single month even if we are gonna employ only m12 for the match. I can fix that. However, a few of my companies lack "btm" in the first year/s since data about "btm" was not reported at that time, and all the companies lack "btm" in their last sample year since "btm" was not published for that year. So, for example, to elaborate on the case of case 00077R10 and control 95214520:

    Case:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long date str10 cusip double(ret year) str2 gsector float(me btm) long IPOdate float(IPO fulldate month x y z)
    413 "00077R10"  -6.122449040412903 1994 "40"    49.381   .3244783 12564 1 12570  6 . . 0
    414 "00077R10"                   0 1994 "40"    49.381   .3244783 12564 1 12600  7 . . 0
    415 "00077R10"  30.434781312942505 1994 "40"     64.41  .24876665 12564 1 12631  8 . . 0
    416 "00077R10"   1.666666753590107 1994 "40"   65.4835  .24468854 12564 1 12662  9 . . 0
    417 "00077R10"  1.6393441706895828 1994 "40"    66.557  .24074194 12564 1 12692 10 . . 0
    418 "00077R10"   6.451612710952759 1994 "40"    70.851  .22615153 12564 1 12723 11 . . 0
    419 "00077R10"  22.727273404598236 1994 "40"   86.9535   .1842716 12564 1 12753 12 . . 0
    420 "00077R10"   4.938271641731262 1995 "40"    91.715  .20684634 12564 1 12784  1 . . 0
    421 "00077R10"    2.35294122248888 1995 "40"    93.873  .20209123 12564 1 12815  2 . . 0
    422 "00077R10"  13.793103396892548 1995 "40"   106.821  .17759533 12564 1 12843  3 . . 0
    423 "00077R10"  2.0202020183205605 1995 "40"   109.181  .17375652 12564 1 12874  4 . . 0
    424 "00077R10"   4.950495064258575 1995 "40"   114.586  .16556047 12564 1 12904  5 . . 0
    425 "00077R10"  11.320754885673523 1995 "40"   127.558  .14872381 12564 1 12935  6 . . 0
    426 "00077R10"   6.779661029577255 1995 "40"   136.458  .13902381 12564 1 12965  7 . . 0
    427 "00077R10"    8.33333358168602 1995 "40"  147.8295  .12832966 12564 1 12996  8 . . 0
    428 "00077R10"  10.989011079072952 1995 "40"  164.0745  .11562376 12564 1 13027  9 . . 0
    429 "00077R10"   16.83168262243271 1995 "40"  192.7235   .0984359 12564 1 13057 10 . . 0
    430 "00077R10"   15.25423675775528 1995 "40"   222.122  .08540762 12564 1 13088 11 . . 0
    431 "00077R10"  29.411765933036804 1995 "40"   287.452 .065996796 12564 1 13118 12 . . 0
    432 "00077R10"   9.090909361839294 1996 "40"   313.584   .5776754 12564 1 13149  1 . . 0
    433 "00077R10"            17.96875 1996 "40"    369.95   .4896601 12564 1 13180  2 . . 0
    434 "00077R10"  23.178808391094208 1996 "40"     455.7   .3975198 12564 1 13209  3 . . 0
    435 "00077R10"   34.40860211849213 1996 "40"  838.5625   .2160242 12564 1 13240  4 . . 0
    436 "00077R10"   .4000000189989805 1996 "40"  841.9167  .21516353 12564 1 13270  5 . . 0
    437 "00077R10" -19.920319318771362 1996 "40"  674.2042   .2686868 12564 1 13301  6 . . 0
    438 "00077R10"   .9950248524546623 1996 "40"   689.591  .26269162 12564 1 13331  7 . . 0
    439 "00077R10"  11.330049484968185 1996 "40"   767.722  .23595753 12564 1 13362  8 . . 0
    440 "00077R10"   27.43362784385681 1996 "40"   978.336   .1851611 12564 1 13393  9 . . 0
    441 "00077R10"   -3.81944440305233 1996 "40"  943.5313   .1919913 12564 1 13423 10 . . 0
    442 "00077R10" -37.906137108802795 1996 "40"   585.875   .3091953 12564 1 13454 11 . . 0
    443 "00077R10"  -8.430232852697372 1996 "40"  536.4844   .3376609 12564 1 13484 12 . . 0
    444 "00077R10"  11.428571492433548 1997 "40"  597.7969   .3246853 12564 1 13515  1 . . 0
    445 "00077R10"   6.552706658840179 1997 "40"  636.9688    .304718 12564 1 13546  2 . . 0
    446 "00077R10"  -22.99465239048004 1997 "40"     490.5   .3957102 12564 1 13574  3 . . 0
    447 "00077R10"   11.80555522441864 1997 "40"  550.9219    .352311 12564 1 13605  4 . . 0
    448 "00077R10"   59.62733030319214 1997 "40"  879.4219  .22070844 12564 1 13635  5 . . 0
    449 "00077R10"  -9.727626293897629 1997 "40"   793.875   .2444917 12564 1 13666  6 . . 0
    450 "00077R10"  -6.034482643008232 1997 "40"   745.996  .26018348 12564 1 13696  7 . . 0
    451 "00077R10"  -1.376146823167801 1997 "40"    735.73  .26381397 12564 1 13727  8 . . 0
    452 "00077R10"  2.7906976640224457 1997 "40"   756.262  .25665158 12564 1 13758  9 . . 0
    453 "00077R10" -14.932127296924591 1997 "40"  643.5945    .301581 12564 1 13788 10 . . 0
    454 "00077R10"  2.1276595070958138 1997 "40"   657.288    .295298 12564 1 13819 11 . . 0
    455 "00077R10"  -.5208333488553762 1997 "40"  653.8646  .29684407 12564 1 13849 12 . . 0
    456 "00077R10"   3.141361102461815 1998 "40"   674.134  .34209135 12564 1 13880  1 . . 0
    457 "00077R10"  15.228426456451416 1998 "40"   776.794  .29688102 12564 1 13911  2 . . 0
    458 "00077R10"  -.8810572326183319 1998 "40"    769.95     .29952 12564 1 13939  3 . . 0
    459 "00077R10"   2.888888865709305 1998 "40"  830.3326  .27773857 12564 1 13970  4 . . 0
    460 "00077R10" -11.447083950042725 1998 "40"  735.2838   .3136414 12564 1 14000  5 . . 0
    461 "00077R10"   -7.31707289814949 1998 "40"  681.4825   .3384025 12564 1 14031  6 . . 0
    462 "00077R10"  -26.31579041481018 1998 "40"   502.285   .4591326 12564 1 14061  7 . . 0
    463 "00077R10" -17.142857611179352 1998 "40"   416.179  .55412555 12564 1 14092  8 . . 0
    464 "00077R10"  -5.603448301553726 1998 "40"  392.8586  .58701885 12564 1 14123  9 . . 0
    465 "00077R10"   37.89954483509064 1998 "40"  541.9579   .4255227 12564 1 14153 10 . . 0
    466 "00077R10"  -9.933774918317795 1998 "40"   488.121   .4724554 12564 1 14184 11 . . 0
    467 "00077R10"  15.441176295280457 1998 "40"  563.4926  .40926075 12564 1 14214 12 . . 0
    468 "00077R10"   16.56050980091095 1999 "40"  656.8099          . 12564 1 14245  1 . . 0
    469 "00077R10"  -22.95081913471222 1999 "40"  506.0666          . 12564 1 14276  2 . . 0
    470 "00077R10" -1.4184396713972092 1999 "40"  498.8884          . 12564 1 14304  3 . . 0
    471 "00077R10"   .7194244768470526 1999 "40"  502.4775          . 12564 1 14335  4 . . 0
    472 "00077R10"  44.285714626312256 1999 "40"  725.0032          . 12564 1 14365  5 . . 0
    473 "00077R10"   256.6831588745117 1999 "40" 2585.9646          . 12564 1 14396  6 . . 0
    end
    format %tm date
    format %td IPOdate
    format %d fulldate

    Control:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long date str10 cusip double(ret year) str2 gsector float(me btm) long IPOdate float(IPO fulldate month x y z)
    413 "95214520"   2.063492126762867 1994 "40"    42.016  .6427801 . 0 12570  6 1 1 0
    414 "95214520"            -2.34375 1994 "40"  41.03125  .6582068 . 0 12600  7 1 1 0
    415 "95214520"   3.999999910593033 1994 "40"   42.6725  .6328911 . 0 12631  8 1 1 0
    416 "95214520"  1.9999999552965164 1994 "40"   43.3455  .6230646 . 0 12662  9 1 1 0
    417 "95214520"   1.515151560306549 1994 "40"  44.00225  .6137651 . 0 12692 10 1 1 0
    418 "95214520" -10.447761416435242 1994 "40"    39.405   .685371 . 0 12723 11 1 1 0
    419 "95214520" -2.8333332389593124 1994 "40"   38.0045  .7106276 . 0 12753 12 1 1 0
    420 "95214520"                   0 1995 "40"   38.0045  1.399783 . 0 12784  1 1 1 0
    421 "95214520"   -5.17241396009922 1995 "40"  36.03875  1.476135 . 0 12815  2 1 1 0
    422 "95214520"   5.454545468091965 1995 "40"   63.2925  .8405112 . 0 12843  3 1 1 0
    423 "95214520"  -2.931034378707409 1995 "40"     61.11  .8705295 . 0 12874  4 1 1 0
    424 "95214520"                   0 1995 "40"     61.11  .8705295 . 0 12904  5 1 1 0
    425 "95214520"    5.35714291036129 1995 "40"    64.428  .8256977 . 0 12935  6 1 1 0
    426 "95214520"   9.830508381128311 1995 "40"    70.434  .7552894 . 0 12965  7 1 1 0
    427 "95214520"  3.8759689778089523 1995 "40"    73.164   .727107 . 0 12996  8 1 1 0
    428 "95214520"   5.970149114727974 1995 "40"  85.28875  .6237406 . 0 13027  9 1 1 0
    429 "95214520"   4.225354269146919 1995 "40"   88.5405   .600833 . 0 13057 10 1 1 0
    430 "95214520"   2.985074557363987 1995 "40"   91.1835 .58341753 . 0 13088 11 1 1 0
    431 "95214520" -3.6231882870197296 1995 "40"   79.8665  .6660872 . 0 13118 12 1 1 0
    432 "95214520"  3.3984962850809097 1996 "40"   82.2685  .8161725 . 0 13149  1 1 1 0
    433 "95214520"    9.48905125260353 1996 "40"    90.075  .7454376 . 0 13180  2 1 1 0
    434 "95214520"   3.999999910593033 1996 "40"    93.873   .715278 . 0 13209  3 1 1 0
    435 "95214520"  -4.153846204280853 1996 "40"  89.66075  .7488816 . 0 13240  4 1 1 0
    436 "95214520"                   0 1996 "40"  89.66075  .7488816 . 0 13270  5 1 1 0
    437 "95214520"  2.0134227350354195 1996 "40"    91.523  .7336439 . 0 13301  6 1 1 0
    438 "95214520"  -.3157894825562835 1996 "40"  90.92088  .7385025 . 0 13331  7 1 1 0
    439 "95214520"   .6622516550123692 1996 "40"    91.523  .7336439 . 0 13362  8 1 1 0
    440 "95214520"  3.2894738018512726 1996 "40" 130.83987 .51318675 . 0 13393  9 1 1 0
    441 "95214520"   8.662420511245728 1996 "40"   141.678  .4739289 . 0 13423 10 1 1 0
    442 "95214520"   4.411764815449715 1996 "40"  147.9285  .4539037 . 0 13454 11 1 1 0
    443 "95214520"    2.11267601698637 1996 "40" 121.49187  .5526731 . 0 13484 12 1 1 0
    444 "95214520"  12.744827568531036 1997 "40" 136.57362  .5804199 . 0 13515  1 1 1 0
    445 "95214520"  6.1349693685770035 1997 "40" 144.95238  .5468696 . 0 13546  2 1 1 0
    446 "95214520"                   0 1997 "40"  145.0605   .546462 . 0 13574  3 1 1 0
    447 "95214520"   4.323699325323105 1997 "40"    150.93  .5252107 . 0 13605  4 1 1 0
    448 "95214520"   11.11111119389534 1997 "40"     167.7  .4726896 . 0 13635  5 1 1 0
    449 "95214520"                  25 1997 "40"  209.9375  .3775888 . 0 13666  6 1 1 0
    450 "95214520" -7.0079997181892395 1997 "40"   194.822  .4068845 . 0 13696  7 1 1 0
    451 "95214520"  -9.482758492231369 1997 "40"  176.3475 .44951046 . 0 13727  8 1 1 0
    452 "95214520"  18.095238506793976 1997 "40"   208.258  .3806339 . 0 13758  9 1 1 0
    453 "95214520" -4.5967742800712585 1997 "40"   198.181  .3999881 . 0 13788 10 1 1 0
    454 "95214520"    34.7457617521286 1997 "40"  267.0405 .29684654 . 0 13819 11 1 1 0
    455 "95214520"  -4.716981202363968 1997 "40"  257.2975  .3080871 . 0 13849 12 1 1 0
    456 "95214520"   6.138613820075989 1998 "40"  272.5825   .430053 . 0 13880  1 1 1 0
    457 "95214520"   .9345794096589088 1998 "40"    275.13   .426071 . 0 13911  2 1 1 0
    458 "95214520"  -6.018518656492233 1998 "40" 325.86575  .3597338 . 0 13939  3 1 1 0
    459 "95214520"  3.1527094542980194 1998 "40" 335.49725  .3494065 . 0 13970  4 1 1 0
    460 "95214520"  -3.349282220005989 1998 "40"  324.2605  .3615147 . 0 14000  5 1 1 0
    461 "95214520" -2.4752475321292877 1998 "40"  316.8745 .36994115 . 0 14031  6 1 1 0
    462 "95214520"  -8.934009820222855 1998 "40"  287.9215  .4071419 . 0 14061  7 1 1 0
    463 "95214520" -22.905027866363525 1998 "40"   221.973 .52810436 . 0 14092  8 1 1 0
    464 "95214520" -2.8985507786273956 1998 "40"   215.539  .5438687 . 0 14123  9 1 1 0
    465 "95214520"   21.03283852338791 1998 "40"  260.0981   .450695 . 0 14153 10 1 1 0
    466 "95214520"  17.006802558898926 1998 "40"  304.3325   .385187 . 0 14184 11 1 1 0
    467 "95214520" -2.3255813866853714 1998 "40"   297.717  .3937461 . 0 14214 12 1 1 0
    468 "95214520" -2.7142856270074844 1999 "40" 288.85638 .40432665 . 0 14245  1 1 1 0
    469 "95214520" -3.0674846842885017 1999 "40" 279.99576  .4171218 . 0 14276  2 1 1 0
    470 "95214520"  -5.379746854305267 1999 "40" 264.93268  .4408378 . 0 14304  3 1 1 0
    471 "95214520"   -7.39799365401268 1999 "40" 244.55325  .4775742 . 0 14335  4 1 1 1
    472 "95214520"  1.0869565419852734 1999 "40" 247.21144   .472439 . 0 14365  5 1 1 0
    473 "95214520"    .358422938734293 1999 "40"  248.0975  .4707517 . 0 14396  6 1 1 0
    end
    format %tm date
    format %td IPOdate
    format %d fulldate


    So in this case we have that, since 95214520 trades from much earlier and much longer than 00077R10, "btm" is not absent at all for the former. But as you can see for 00077R10, in the 6 last months of trading during 1999 there is no "btm "available. Thus, I think that with the controls there is no trouble: we can simply choose controls that run at least as long as does the case with "btm" available for all the months during the case running period, and drop the controls' observations with a missing(btm) from the beginning and the end of their samples so that they cause no trouble to the algorithm. However, I still need to have into account the observations of the case during the first and last few months even if "btm" is not available in those, knowing that those "btm" observations will not be needed for the match. Do you think we can somehow work around this?



    For the sake of completeness, I add another control ...

    2nd control:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long date str10 cusip double(ret year) str2 gsector float(me btm) long IPOdate float(IPO fulldate month x y z)
    413 "01852J10"  -6.435643881559372 1994 "40"  49.94325  .9353796 . 0 12570  6 1 1 0
    414 "01852J10" -1.0582010261714458 1994 "40"  49.41475  .9453837 . 0 12600  7 1 1 0
    415 "01852J10"  -.5347593687474728 1994 "40"   49.1505  .9504663 . 0 12631  8 1 1 0
    416 "01852J10"   5.376344174146652 1994 "40"   51.9155  .8998449 . 0 12662  9 1 1 0
    417 "01852J10"  -5.102040991187096 1994 "40"  49.26675  .9482237 . 0 12692 10 1 1 0
    418 "01852J10"  -7.526881992816925 1994 "40"   45.5585 1.0254047 . 0 12723 11 1 1 0
    419 "01852J10"  1.1627906933426857 1994 "40"   46.3275 1.0083838 . 0 12753 12 1 1 0
    420 "01852J10"   1.149425283074379 1995 "40"     46.86 1.1091974 . 0 12784  1 1 1 0
    421 "01852J10"  2.2727273404598236 1995 "40"    47.925 1.0845486 . 0 12815  2 1 1 0
    422 "01852J10"  -6.666667014360428 1995 "40"     44.73 1.1620164 . 0 12843  3 1 1 0
    423 "01852J10"    7.14285746216774 1995 "40"    47.925 1.0845486 . 0 12874  4 1 1 0
    424 "01852J10" -1.1111111380159855 1995 "40"   47.3925 1.0967345 . 0 12904  5 1 1 0
    425 "01852J10"   4.494382068514824 1995 "40"  49.54575 1.0490706 . 0 12935  6 1 1 0
    426 "01852J10"   7.526881992816925 1995 "40"    53.275  .9756356 . 0 12965  7 1 1 0
    427 "01852J10"  14.000000059604645 1995 "40"   60.7335  .8558208 . 0 12996  8 1 1 0
    428 "01852J10"  -2.631578966975212 1995 "40"     74.37  .6988972 . 0 13027  9 1 1 0
    429 "01852J10" -3.6036036908626556 1995 "40"     71.69  .7250243 . 0 13057 10 1 1 0
    430 "01852J10"  1.0514019057154655 1995 "40"  72.44375  .7174807 . 0 13088 11 1 1 0
    431 "01852J10"  -.5780346691608429 1995 "40"   57.6415  .9017287 . 0 13118 12 1 1 0
    432 "01852J10"   3.488372266292572 1996 "40"  59.65225  .9299049 . 0 13149  1 1 1 0
    433 "01852J10"                   0 1996 "40"  59.65225  .9299049 . 0 13180  2 1 1 0
    434 "01852J10"  -5.617977678775787 1996 "40"     56.49  .9819599 . 0 13209  3 1 1 0
    435 "01852J10"                   0 1996 "40"     56.49  .9819599 . 0 13240  4 1 1 0
    436 "01852J10"   1.785714365541935 1996 "40"  57.49875  .9647326 . 0 13270  5 1 1 0
    437 "01852J10"   18.12865436077118 1996 "40"   67.9225  .8166795 . 0 13301  6 1 1 0
    438 "01852J10"  -7.920791953802109 1996 "40"   62.5425  .8869315 . 0 13331  7 1 1 0
    439 "01852J10"   1.344086043536663 1996 "40"  63.38313  .8751685 . 0 13362  8 1 1 0
    440 "01852J10" -.26525198481976986 1996 "40"   63.3325  .8758681 . 0 13393  9 1 1 0
    441 "01852J10" -1.0638297535479069 1996 "40"  62.65875   .885286 . 0 13423 10 1 1 0
    442 "01852J10"  18.817204236984253 1996 "40"  74.44937  .7450824 . 0 13454 11 1 1 0
    443 "01852J10"  -9.502262622117996 1996 "40"    67.375   .823316 . 0 13484 12 1 1 0
    444 "01852J10"  10.000000149011612 1997 "40"   74.1125 1.7667503 . 0 13515  1 1 1 0
    445 "01852J10"  12.613636255264282 1997 "40"  83.46078   1.56886 . 0 13546  2 1 1 0
    446 "01852J10"  -6.034308671951294 1997 "40"    78.155  1.675367 . 0 13574  3 1 1 0
    447 "01852J10"  -4.310344904661179 1997 "40"  74.78625  1.750834 . 0 13605  4 1 1 0
    448 "01852J10"   5.855855718255043 1997 "40"  79.16563  1.653979 . 0 13635  5 1 1 0
    449 "01852J10"  3.1063830479979515 1997 "40" 161.01813  .8131897 . 0 13666  6 1 1 0
    450 "01852J10"   2.074688859283924 1997 "40" 164.35875  .7966616 . 0 13696  7 1 1 0
    451 "01852J10"  1.2195121496915817 1997 "40" 166.36313  .7870632 . 0 13727  8 1 1 0
    452 "01852J10"   17.39758998155594 1997 "40"   194.485  .6732565 . 0 13758  9 1 1 0
    453 "01852J10"   7.216494530439377 1997 "40"    208.52  .6279411 . 0 13788 10 1 1 0
    454 "01852J10"   .9615384973585606 1997 "40"   210.525  .6219608 . 0 13819 11 1 1 0
    455 "01852J10"   1.371428556740284 1997 "40"   212.583  .6159396 . 0 13849 12 1 1 0
    456 "01852J10"  -.9433962404727936 1998 "40"  210.5775  .8829857 . 0 13880  1 1 1 0
    457 "01852J10"    4.76190485060215 1998 "40"   220.605    .84285 . 0 13911  2 1 1 0
    458 "01852J10"  1.3090909458696842 1998 "40"   222.666  .8350485 . 0 13939  3 1 1 0
    459 "01852J10"   .9009009227156639 1998 "40"   224.672  .8275927 . 0 13970  4 1 1 0
    460 "01852J10"   -2.45535708963871 1998 "40"  219.1555  .8484246 . 0 14000  5 1 1 0
    461 "01852J10" -12.183066457509995 1998 "40" 272.98676  .6811206 . 0 14031  6 1 1 0
    462 "01852J10" -2.6178009808063507 1998 "40"  265.8405  .6994303 . 0 14061  7 1 1 0
    463 "01852J10" -27.956989407539368 1998 "40"  191.5195  .9708511 . 0 14092  8 1 1 0
    464 "01852J10"  10.358209162950516 1998 "40" 210.52237  .8832169 . 0 14123  9 1 1 0
    465 "01852J10"   4.081632569432259 1998 "40" 219.11513  .8485809 . 0 14153 10 1 1 0
    466 "01852J10"  1.9607843831181526 1998 "40"  223.4115   .832262 . 0 14184 11 1 1 0
    467 "01852J10"  1.0384615510702133 1998 "40" 215.77437  .8617192 . 0 14214 12 1 1 0
    468 "01852J10"   1.916932873427868 1999 "40" 219.91063  .6987869 . 0 14245  1 1 1 0
    469 "01852J10"  -.9404388256371021 1999 "40"  217.8425   .705421 . 0 14276  2 1 1 0
    470 "01852J10"  -9.417721629142761 1999 "40"  195.7825   .784905 . 0 14304  3 1 1 0
    471 "01852J10"   16.19718372821808 1999 "40" 227.49374   .675494 . 0 14335  4 1 1 1
    472 "01852J10"  18.484848737716675 1999 "40" 269.54562    .57011 . 0 14365  5 1 1 0
    473 "01852J10"  -4.286444932222366 1999 "40"  256.4475  .5992286 . 0 14396  6 1 1 0
    end
    format %tm date
    format %td IPOdate
    format %d fulldate

    Thank you for your attention.


    Comment


    • #17
      Well, this data still doesn't produce any matches. For example, your case's IPO date is May 1994, but the dates of all the observations are all June 1994 or later, so it is impossible to try to match a control with the case's me in May 1994.

      Rather than asking you to come up with better example data from scratch again, I have written some code that I think is at least within striking distance of correct, though I am relying on my imagination of how the data will look and behave.

      Bear in mind that much of this code is untested as there was not data suitable for running it. So there may be problems.

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input long date str10 cusip double(ret year) str2 gsector float(me btm) long IPOdate float(IPO fulldate month x y z)
      413 "00077R10"  -6.122449040412903 1994 "40"    49.381   .3244783 12564 1 12570  6 . . 0
      414 "00077R10"                   0 1994 "40"    49.381   .3244783 12564 1 12600  7 . . 0
      415 "00077R10"  30.434781312942505 1994 "40"     64.41  .24876665 12564 1 12631  8 . . 0
      416 "00077R10"   1.666666753590107 1994 "40"   65.4835  .24468854 12564 1 12662  9 . . 0
      417 "00077R10"  1.6393441706895828 1994 "40"    66.557  .24074194 12564 1 12692 10 . . 0
      418 "00077R10"   6.451612710952759 1994 "40"    70.851  .22615153 12564 1 12723 11 . . 0
      419 "00077R10"  22.727273404598236 1994 "40"   86.9535   .1842716 12564 1 12753 12 . . 0
      420 "00077R10"   4.938271641731262 1995 "40"    91.715  .20684634 12564 1 12784  1 . . 0
      421 "00077R10"    2.35294122248888 1995 "40"    93.873  .20209123 12564 1 12815  2 . . 0
      422 "00077R10"  13.793103396892548 1995 "40"   106.821  .17759533 12564 1 12843  3 . . 0
      423 "00077R10"  2.0202020183205605 1995 "40"   109.181  .17375652 12564 1 12874  4 . . 0
      424 "00077R10"   4.950495064258575 1995 "40"   114.586  .16556047 12564 1 12904  5 . . 0
      425 "00077R10"  11.320754885673523 1995 "40"   127.558  .14872381 12564 1 12935  6 . . 0
      426 "00077R10"   6.779661029577255 1995 "40"   136.458  .13902381 12564 1 12965  7 . . 0
      427 "00077R10"    8.33333358168602 1995 "40"  147.8295  .12832966 12564 1 12996  8 . . 0
      428 "00077R10"  10.989011079072952 1995 "40"  164.0745  .11562376 12564 1 13027  9 . . 0
      429 "00077R10"   16.83168262243271 1995 "40"  192.7235   .0984359 12564 1 13057 10 . . 0
      430 "00077R10"   15.25423675775528 1995 "40"   222.122  .08540762 12564 1 13088 11 . . 0
      431 "00077R10"  29.411765933036804 1995 "40"   287.452 .065996796 12564 1 13118 12 . . 0
      432 "00077R10"   9.090909361839294 1996 "40"   313.584   .5776754 12564 1 13149  1 . . 0
      433 "00077R10"            17.96875 1996 "40"    369.95   .4896601 12564 1 13180  2 . . 0
      434 "00077R10"  23.178808391094208 1996 "40"     455.7   .3975198 12564 1 13209  3 . . 0
      435 "00077R10"   34.40860211849213 1996 "40"  838.5625   .2160242 12564 1 13240  4 . . 0
      436 "00077R10"   .4000000189989805 1996 "40"  841.9167  .21516353 12564 1 13270  5 . . 0
      437 "00077R10" -19.920319318771362 1996 "40"  674.2042   .2686868 12564 1 13301  6 . . 0
      438 "00077R10"   .9950248524546623 1996 "40"   689.591  .26269162 12564 1 13331  7 . . 0
      439 "00077R10"  11.330049484968185 1996 "40"   767.722  .23595753 12564 1 13362  8 . . 0
      440 "00077R10"   27.43362784385681 1996 "40"   978.336   .1851611 12564 1 13393  9 . . 0
      441 "00077R10"   -3.81944440305233 1996 "40"  943.5313   .1919913 12564 1 13423 10 . . 0
      442 "00077R10" -37.906137108802795 1996 "40"   585.875   .3091953 12564 1 13454 11 . . 0
      443 "00077R10"  -8.430232852697372 1996 "40"  536.4844   .3376609 12564 1 13484 12 . . 0
      444 "00077R10"  11.428571492433548 1997 "40"  597.7969   .3246853 12564 1 13515  1 . . 0
      445 "00077R10"   6.552706658840179 1997 "40"  636.9688    .304718 12564 1 13546  2 . . 0
      446 "00077R10"  -22.99465239048004 1997 "40"     490.5   .3957102 12564 1 13574  3 . . 0
      447 "00077R10"   11.80555522441864 1997 "40"  550.9219    .352311 12564 1 13605  4 . . 0
      448 "00077R10"   59.62733030319214 1997 "40"  879.4219  .22070844 12564 1 13635  5 . . 0
      449 "00077R10"  -9.727626293897629 1997 "40"   793.875   .2444917 12564 1 13666  6 . . 0
      450 "00077R10"  -6.034482643008232 1997 "40"   745.996  .26018348 12564 1 13696  7 . . 0
      451 "00077R10"  -1.376146823167801 1997 "40"    735.73  .26381397 12564 1 13727  8 . . 0
      452 "00077R10"  2.7906976640224457 1997 "40"   756.262  .25665158 12564 1 13758  9 . . 0
      453 "00077R10" -14.932127296924591 1997 "40"  643.5945    .301581 12564 1 13788 10 . . 0
      454 "00077R10"  2.1276595070958138 1997 "40"   657.288    .295298 12564 1 13819 11 . . 0
      455 "00077R10"  -.5208333488553762 1997 "40"  653.8646  .29684407 12564 1 13849 12 . . 0
      456 "00077R10"   3.141361102461815 1998 "40"   674.134  .34209135 12564 1 13880  1 . . 0
      457 "00077R10"  15.228426456451416 1998 "40"   776.794  .29688102 12564 1 13911  2 . . 0
      458 "00077R10"  -.8810572326183319 1998 "40"    769.95     .29952 12564 1 13939  3 . . 0
      459 "00077R10"   2.888888865709305 1998 "40"  830.3326  .27773857 12564 1 13970  4 . . 0
      460 "00077R10" -11.447083950042725 1998 "40"  735.2838   .3136414 12564 1 14000  5 . . 0
      461 "00077R10"   -7.31707289814949 1998 "40"  681.4825   .3384025 12564 1 14031  6 . . 0
      462 "00077R10"  -26.31579041481018 1998 "40"   502.285   .4591326 12564 1 14061  7 . . 0
      463 "00077R10" -17.142857611179352 1998 "40"   416.179  .55412555 12564 1 14092  8 . . 0
      464 "00077R10"  -5.603448301553726 1998 "40"  392.8586  .58701885 12564 1 14123  9 . . 0
      465 "00077R10"   37.89954483509064 1998 "40"  541.9579   .4255227 12564 1 14153 10 . . 0
      466 "00077R10"  -9.933774918317795 1998 "40"   488.121   .4724554 12564 1 14184 11 . . 0
      467 "00077R10"  15.441176295280457 1998 "40"  563.4926  .40926075 12564 1 14214 12 . . 0
      468 "00077R10"   16.56050980091095 1999 "40"  656.8099          . 12564 1 14245  1 . . 0
      469 "00077R10"  -22.95081913471222 1999 "40"  506.0666          . 12564 1 14276  2 . . 0
      470 "00077R10" -1.4184396713972092 1999 "40"  498.8884          . 12564 1 14304  3 . . 0
      471 "00077R10"   .7194244768470526 1999 "40"  502.4775          . 12564 1 14335  4 . . 0
      472 "00077R10"  44.285714626312256 1999 "40"  725.0032          . 12564 1 14365  5 . . 0
      473 "00077R10"   256.6831588745117 1999 "40" 2585.9646          . 12564 1 14396  6 . . 0
      end
      format %tm date
      format %td IPOdate
      format %d fulldate
      tempfile building
      save `building', replace
      
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input long date str10 cusip double(ret year) str2 gsector float(me btm) long IPOdate float(IPO fulldate month x y z)
      413 "95214520"   2.063492126762867 1994 "40"    42.016  .6427801 . 0 12570  6 1 1 0
      414 "95214520"            -2.34375 1994 "40"  41.03125  .6582068 . 0 12600  7 1 1 0
      415 "95214520"   3.999999910593033 1994 "40"   42.6725  .6328911 . 0 12631  8 1 1 0
      416 "95214520"  1.9999999552965164 1994 "40"   43.3455  .6230646 . 0 12662  9 1 1 0
      417 "95214520"   1.515151560306549 1994 "40"  44.00225  .6137651 . 0 12692 10 1 1 0
      418 "95214520" -10.447761416435242 1994 "40"    39.405   .685371 . 0 12723 11 1 1 0
      419 "95214520" -2.8333332389593124 1994 "40"   38.0045  .7106276 . 0 12753 12 1 1 0
      420 "95214520"                   0 1995 "40"   38.0045  1.399783 . 0 12784  1 1 1 0
      421 "95214520"   -5.17241396009922 1995 "40"  36.03875  1.476135 . 0 12815  2 1 1 0
      422 "95214520"   5.454545468091965 1995 "40"   63.2925  .8405112 . 0 12843  3 1 1 0
      423 "95214520"  -2.931034378707409 1995 "40"     61.11  .8705295 . 0 12874  4 1 1 0
      424 "95214520"                   0 1995 "40"     61.11  .8705295 . 0 12904  5 1 1 0
      425 "95214520"    5.35714291036129 1995 "40"    64.428  .8256977 . 0 12935  6 1 1 0
      426 "95214520"   9.830508381128311 1995 "40"    70.434  .7552894 . 0 12965  7 1 1 0
      427 "95214520"  3.8759689778089523 1995 "40"    73.164   .727107 . 0 12996  8 1 1 0
      428 "95214520"   5.970149114727974 1995 "40"  85.28875  .6237406 . 0 13027  9 1 1 0
      429 "95214520"   4.225354269146919 1995 "40"   88.5405   .600833 . 0 13057 10 1 1 0
      430 "95214520"   2.985074557363987 1995 "40"   91.1835 .58341753 . 0 13088 11 1 1 0
      431 "95214520" -3.6231882870197296 1995 "40"   79.8665  .6660872 . 0 13118 12 1 1 0
      432 "95214520"  3.3984962850809097 1996 "40"   82.2685  .8161725 . 0 13149  1 1 1 0
      433 "95214520"    9.48905125260353 1996 "40"    90.075  .7454376 . 0 13180  2 1 1 0
      434 "95214520"   3.999999910593033 1996 "40"    93.873   .715278 . 0 13209  3 1 1 0
      435 "95214520"  -4.153846204280853 1996 "40"  89.66075  .7488816 . 0 13240  4 1 1 0
      436 "95214520"                   0 1996 "40"  89.66075  .7488816 . 0 13270  5 1 1 0
      437 "95214520"  2.0134227350354195 1996 "40"    91.523  .7336439 . 0 13301  6 1 1 0
      438 "95214520"  -.3157894825562835 1996 "40"  90.92088  .7385025 . 0 13331  7 1 1 0
      439 "95214520"   .6622516550123692 1996 "40"    91.523  .7336439 . 0 13362  8 1 1 0
      440 "95214520"  3.2894738018512726 1996 "40" 130.83987 .51318675 . 0 13393  9 1 1 0
      441 "95214520"   8.662420511245728 1996 "40"   141.678  .4739289 . 0 13423 10 1 1 0
      442 "95214520"   4.411764815449715 1996 "40"  147.9285  .4539037 . 0 13454 11 1 1 0
      443 "95214520"    2.11267601698637 1996 "40" 121.49187  .5526731 . 0 13484 12 1 1 0
      444 "95214520"  12.744827568531036 1997 "40" 136.57362  .5804199 . 0 13515  1 1 1 0
      445 "95214520"  6.1349693685770035 1997 "40" 144.95238  .5468696 . 0 13546  2 1 1 0
      446 "95214520"                   0 1997 "40"  145.0605   .546462 . 0 13574  3 1 1 0
      447 "95214520"   4.323699325323105 1997 "40"    150.93  .5252107 . 0 13605  4 1 1 0
      448 "95214520"   11.11111119389534 1997 "40"     167.7  .4726896 . 0 13635  5 1 1 0
      449 "95214520"                  25 1997 "40"  209.9375  .3775888 . 0 13666  6 1 1 0
      450 "95214520" -7.0079997181892395 1997 "40"   194.822  .4068845 . 0 13696  7 1 1 0
      451 "95214520"  -9.482758492231369 1997 "40"  176.3475 .44951046 . 0 13727  8 1 1 0
      452 "95214520"  18.095238506793976 1997 "40"   208.258  .3806339 . 0 13758  9 1 1 0
      453 "95214520" -4.5967742800712585 1997 "40"   198.181  .3999881 . 0 13788 10 1 1 0
      454 "95214520"    34.7457617521286 1997 "40"  267.0405 .29684654 . 0 13819 11 1 1 0
      455 "95214520"  -4.716981202363968 1997 "40"  257.2975  .3080871 . 0 13849 12 1 1 0
      456 "95214520"   6.138613820075989 1998 "40"  272.5825   .430053 . 0 13880  1 1 1 0
      457 "95214520"   .9345794096589088 1998 "40"    275.13   .426071 . 0 13911  2 1 1 0
      458 "95214520"  -6.018518656492233 1998 "40" 325.86575  .3597338 . 0 13939  3 1 1 0
      459 "95214520"  3.1527094542980194 1998 "40" 335.49725  .3494065 . 0 13970  4 1 1 0
      460 "95214520"  -3.349282220005989 1998 "40"  324.2605  .3615147 . 0 14000  5 1 1 0
      461 "95214520" -2.4752475321292877 1998 "40"  316.8745 .36994115 . 0 14031  6 1 1 0
      462 "95214520"  -8.934009820222855 1998 "40"  287.9215  .4071419 . 0 14061  7 1 1 0
      463 "95214520" -22.905027866363525 1998 "40"   221.973 .52810436 . 0 14092  8 1 1 0
      464 "95214520" -2.8985507786273956 1998 "40"   215.539  .5438687 . 0 14123  9 1 1 0
      465 "95214520"   21.03283852338791 1998 "40"  260.0981   .450695 . 0 14153 10 1 1 0
      466 "95214520"  17.006802558898926 1998 "40"  304.3325   .385187 . 0 14184 11 1 1 0
      467 "95214520" -2.3255813866853714 1998 "40"   297.717  .3937461 . 0 14214 12 1 1 0
      468 "95214520" -2.7142856270074844 1999 "40" 288.85638 .40432665 . 0 14245  1 1 1 0
      469 "95214520" -3.0674846842885017 1999 "40" 279.99576  .4171218 . 0 14276  2 1 1 0
      470 "95214520"  -5.379746854305267 1999 "40" 264.93268  .4408378 . 0 14304  3 1 1 0
      471 "95214520"   -7.39799365401268 1999 "40" 244.55325  .4775742 . 0 14335  4 1 1 1
      472 "95214520"  1.0869565419852734 1999 "40" 247.21144   .472439 . 0 14365  5 1 1 0
      473 "95214520"    .358422938734293 1999 "40"  248.0975  .4707517 . 0 14396  6 1 1 0
      end
      format %tm date
      format %td IPOdate
      format %d fulldate
      append using `building'
      save `"`building'"', replace
      
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input long date str10 cusip double(ret year) str2 gsector float(me btm) long IPOdate float(IPO fulldate month x y z)
      413 "01852J10"  -6.435643881559372 1994 "40"  49.94325  .9353796 . 0 12570  6 1 1 0
      414 "01852J10" -1.0582010261714458 1994 "40"  49.41475  .9453837 . 0 12600  7 1 1 0
      415 "01852J10"  -.5347593687474728 1994 "40"   49.1505  .9504663 . 0 12631  8 1 1 0
      416 "01852J10"   5.376344174146652 1994 "40"   51.9155  .8998449 . 0 12662  9 1 1 0
      417 "01852J10"  -5.102040991187096 1994 "40"  49.26675  .9482237 . 0 12692 10 1 1 0
      418 "01852J10"  -7.526881992816925 1994 "40"   45.5585 1.0254047 . 0 12723 11 1 1 0
      419 "01852J10"  1.1627906933426857 1994 "40"   46.3275 1.0083838 . 0 12753 12 1 1 0
      420 "01852J10"   1.149425283074379 1995 "40"     46.86 1.1091974 . 0 12784  1 1 1 0
      421 "01852J10"  2.2727273404598236 1995 "40"    47.925 1.0845486 . 0 12815  2 1 1 0
      422 "01852J10"  -6.666667014360428 1995 "40"     44.73 1.1620164 . 0 12843  3 1 1 0
      423 "01852J10"    7.14285746216774 1995 "40"    47.925 1.0845486 . 0 12874  4 1 1 0
      424 "01852J10" -1.1111111380159855 1995 "40"   47.3925 1.0967345 . 0 12904  5 1 1 0
      425 "01852J10"   4.494382068514824 1995 "40"  49.54575 1.0490706 . 0 12935  6 1 1 0
      426 "01852J10"   7.526881992816925 1995 "40"    53.275  .9756356 . 0 12965  7 1 1 0
      427 "01852J10"  14.000000059604645 1995 "40"   60.7335  .8558208 . 0 12996  8 1 1 0
      428 "01852J10"  -2.631578966975212 1995 "40"     74.37  .6988972 . 0 13027  9 1 1 0
      429 "01852J10" -3.6036036908626556 1995 "40"     71.69  .7250243 . 0 13057 10 1 1 0
      430 "01852J10"  1.0514019057154655 1995 "40"  72.44375  .7174807 . 0 13088 11 1 1 0
      431 "01852J10"  -.5780346691608429 1995 "40"   57.6415  .9017287 . 0 13118 12 1 1 0
      432 "01852J10"   3.488372266292572 1996 "40"  59.65225  .9299049 . 0 13149  1 1 1 0
      433 "01852J10"                   0 1996 "40"  59.65225  .9299049 . 0 13180  2 1 1 0
      434 "01852J10"  -5.617977678775787 1996 "40"     56.49  .9819599 . 0 13209  3 1 1 0
      435 "01852J10"                   0 1996 "40"     56.49  .9819599 . 0 13240  4 1 1 0
      436 "01852J10"   1.785714365541935 1996 "40"  57.49875  .9647326 . 0 13270  5 1 1 0
      437 "01852J10"   18.12865436077118 1996 "40"   67.9225  .8166795 . 0 13301  6 1 1 0
      438 "01852J10"  -7.920791953802109 1996 "40"   62.5425  .8869315 . 0 13331  7 1 1 0
      439 "01852J10"   1.344086043536663 1996 "40"  63.38313  .8751685 . 0 13362  8 1 1 0
      440 "01852J10" -.26525198481976986 1996 "40"   63.3325  .8758681 . 0 13393  9 1 1 0
      441 "01852J10" -1.0638297535479069 1996 "40"  62.65875   .885286 . 0 13423 10 1 1 0
      442 "01852J10"  18.817204236984253 1996 "40"  74.44937  .7450824 . 0 13454 11 1 1 0
      443 "01852J10"  -9.502262622117996 1996 "40"    67.375   .823316 . 0 13484 12 1 1 0
      444 "01852J10"  10.000000149011612 1997 "40"   74.1125 1.7667503 . 0 13515  1 1 1 0
      445 "01852J10"  12.613636255264282 1997 "40"  83.46078   1.56886 . 0 13546  2 1 1 0
      446 "01852J10"  -6.034308671951294 1997 "40"    78.155  1.675367 . 0 13574  3 1 1 0
      447 "01852J10"  -4.310344904661179 1997 "40"  74.78625  1.750834 . 0 13605  4 1 1 0
      448 "01852J10"   5.855855718255043 1997 "40"  79.16563  1.653979 . 0 13635  5 1 1 0
      449 "01852J10"  3.1063830479979515 1997 "40" 161.01813  .8131897 . 0 13666  6 1 1 0
      450 "01852J10"   2.074688859283924 1997 "40" 164.35875  .7966616 . 0 13696  7 1 1 0
      451 "01852J10"  1.2195121496915817 1997 "40" 166.36313  .7870632 . 0 13727  8 1 1 0
      452 "01852J10"   17.39758998155594 1997 "40"   194.485  .6732565 . 0 13758  9 1 1 0
      453 "01852J10"   7.216494530439377 1997 "40"    208.52  .6279411 . 0 13788 10 1 1 0
      454 "01852J10"   .9615384973585606 1997 "40"   210.525  .6219608 . 0 13819 11 1 1 0
      455 "01852J10"   1.371428556740284 1997 "40"   212.583  .6159396 . 0 13849 12 1 1 0
      456 "01852J10"  -.9433962404727936 1998 "40"  210.5775  .8829857 . 0 13880  1 1 1 0
      457 "01852J10"    4.76190485060215 1998 "40"   220.605    .84285 . 0 13911  2 1 1 0
      458 "01852J10"  1.3090909458696842 1998 "40"   222.666  .8350485 . 0 13939  3 1 1 0
      459 "01852J10"   .9009009227156639 1998 "40"   224.672  .8275927 . 0 13970  4 1 1 0
      460 "01852J10"   -2.45535708963871 1998 "40"  219.1555  .8484246 . 0 14000  5 1 1 0
      461 "01852J10" -12.183066457509995 1998 "40" 272.98676  .6811206 . 0 14031  6 1 1 0
      462 "01852J10" -2.6178009808063507 1998 "40"  265.8405  .6994303 . 0 14061  7 1 1 0
      463 "01852J10" -27.956989407539368 1998 "40"  191.5195  .9708511 . 0 14092  8 1 1 0
      464 "01852J10"  10.358209162950516 1998 "40" 210.52237  .8832169 . 0 14123  9 1 1 0
      465 "01852J10"   4.081632569432259 1998 "40" 219.11513  .8485809 . 0 14153 10 1 1 0
      466 "01852J10"  1.9607843831181526 1998 "40"  223.4115   .832262 . 0 14184 11 1 1 0
      467 "01852J10"  1.0384615510702133 1998 "40" 215.77437  .8617192 . 0 14214 12 1 1 0
      468 "01852J10"   1.916932873427868 1999 "40" 219.91063  .6987869 . 0 14245  1 1 1 0
      469 "01852J10"  -.9404388256371021 1999 "40"  217.8425   .705421 . 0 14276  2 1 1 0
      470 "01852J10"  -9.417721629142761 1999 "40"  195.7825   .784905 . 0 14304  3 1 1 0
      471 "01852J10"   16.19718372821808 1999 "40" 227.49374   .675494 . 0 14335  4 1 1 1
      472 "01852J10"  18.484848737716675 1999 "40" 269.54562    .57011 . 0 14365  5 1 1 0
      473 "01852J10"  -4.286444932222366 1999 "40"  256.4475  .5992286 . 0 14396  6 1 1 0
      end
      format %tm date
      format %td IPOdate
      format %d fulldate
      append using `building'
      
      //    CREATE SEPARATE CASE AND CONTROL FILES
      isid cusip date, sort
      tempfile original
      save `original'
      by cusip (date): gen final_date = date[_N]
      preserve
      gen IPOmonth = mofd(IPOdate)
      format IPOmonth %tm
      keep if IPO & date == IPOmonth & !missing(btm, gsector, me)
      tempfile cases
      save `cases'
      
      restore
      keep if !IPO & !missing(btm, gsector, me)
      tempfile controls
      save `controls'
      
      //    IDENTIFY THE MATCHING MONTH (DEC OF IPO YEAR) FOR EACH CASE
      use `cases', clear
      
      //    NOW PAIR UP EACH CASE WITH ALL ELIGIBLE CONTROLS
      //    BASED ON SECTOR, SIZE, AND BOOK TO MARKET RATIO
      gen low = me*.70
      gen high = me*1.30
      rangejoin me low high using `controls', by(gsector date) ///
          keepusing(cusip me btm final_date)
      keep if final_date_U >= final_date
      gen delta = abs(btm - btm_U)
      set seed 1234 // OR ANY SEED YOU LIKE
      gen double shuffle = runiform()
      by cusip (delta shuffle), sort: keep if _n == 1
      keep cusip cusip_U
      rename cusip cusip_case
      rename cusip_U cusip_ctrl
      
      //    BRING BACK ORIGINAL DATA AS RELEVANT
      gen cusip = cusip_case
      merge 1:m cusip using `original', assert(match using) keep(match) nogenerate
      replace cusip = cusip_ctrl
      merge 1:1 cusip date using `original', keep(match) nogenerate
      drop cusip

      Comment


      • #18
        I am truly mesmerized by your attention to detail. I did not even realize of the mismatch between month(IPOdate) and the first month of each case’s sample. Thank you for spotting that. Nonetheless, that is easy to work around, even if the IPO month is different from the first, if we have no other first recorded observation, we should match based on the first month (in this case it would be June 1994). So I have modified the file as follows:
        Code:
        bysort cusip (date): gen first_date = date[1]
        gen fullfirst_date=dofm(first_date)
        format fullfirst_date %d
        replace IPOdate=fullfirst_date if IPOdate!=fullfirst_date & IPOdate!=.
        After then I have run your code line by line to make sure that I could understand what it does and spot any glitches. It run until the end but Stata complained in two lines:

        1st complain in * NOW PAIR UP EACH CASE WITH ALL ELIGIBLE CONTROLS BASED ON SECTOR, SIZE, AND BOOK TO MARKET RATIO:
        Code:
        rangejoin me low high using `controls', by(gsector date) ///
            keepusing(cusip me btm final_date)
        I simply changed it by the following and worked nicely.
        Code:
        rangejoin me low high using `controls', by(gsector date) keepusing(cusip me btm final_date)
        2nd complain in * BRING BACK ORIGINAL DATA AS RELEVANT:
        Code:
        merge 1:1 cusip date using `original', keep(match) nogenerate
        Error message:
        variables cusip date do not uniquely identify observations in the master data
        r(459);


        Now, regardless of the foregoing it seems to have worked out well. As follows a brief example of the result:


        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input str10(cusip_case cusip_ctrl) long date double(ret year) str2 gsector float(me btm) long IPOdate float(IPO fulldate month first_date fullfirst_date)
        "00077R10" "05945K10" 413 -6.122449040412903 1994 "40"   49.381   .3244783 12570 1 12570  6 413 12570
        "00077R10" "05945K10" 414                  0 1994 "40"   49.381   .3244783 12570 1 12600  7 413 12570
        "00077R10" "05945K10" 415 30.434781312942505 1994 "40"    64.41  .24876665 12570 1 12631  8 413 12570
        "00077R10" "05945K10" 416  1.666666753590107 1994 "40"  65.4835  .24468854 12570 1 12662  9 413 12570
        "00077R10" "05945K10" 417 1.6393441706895828 1994 "40"   66.557  .24074194 12570 1 12692 10 413 12570
        "00077R10" "05945K10" 418  6.451612710952759 1994 "40"   70.851  .22615153 12570 1 12723 11 413 12570
        end
        format %tm date
        format %td IPOdate
        format %d fulldate
        format %d fullfirst_date

        However, my total sample of cases was composed by 7,127 cusips and 663,488 observations in total, and the resulting cusip_case has 5496 distinct cusips and 516,622 observations in total, so I am missing some cases. Do you think that it has anything to do with the 2nd complain?


        Apart from that I think that this is an excellent piece of work. Thank you very much for taking the time to help me.




        Comment


        • #19
          I don't take credit for attention to detail. Stata, like all software, is relentlessly attentive to detail and it complains when the details are not right. So as a Stata programmer, I am forced to attend to the details.

          As for the "variables cusip date do not uniquely identify observations in the master data " message, I am puzzled by that. I would need a good example of your data that produces that error message but otherwise seems to work will to try to troubleshoot that. I am clearly missing something that is going on in the code, because I expected cusip and date to uniquely identify observations at that point.

          The loss of cusips in your final result is not related to the failure of the final -merge-. All the final -merge- does is match up all of the observations on the control with the case to which it is matched; it doesn't affect the number of cases. The shortfall in cases, I believe, reflects some cases for which no control meets all the matching criteria. If you can find an example of a case that is missing but for which you can also find an acceptable match, then there is an error in the code. But I don't think you will find such an example. This is typical in matched-pair studies: there are usually some cases for which no suitable match is found. Either you live with that, or you loosen the match criteria.

          Comment


          • #20
            As soon as I get my hands on a computer within a few days I will play with the algo and the match criteria to see whether we can optimise the number of matches.

            I would like to help you with the error message puzzle. What amount and type of data would you need me to provide?

            Comment


            • #21
              So, run the code on your real data until you get to the error message. Then at that point run:

              Code:
              duplicates report cusip date
              duplicates tag cusip date, gen(flag)
              dataex if flag
              and post all of that output. That will give me a sense of how big the problem is, and it will also show me (some of) the offending observations. Hopefully I'll be able to see something that suggests where the problem is coming from.

              Comment


              • #22
                Actually, disregard #21. I now realize why the 1:1 merging is incorrect. And as a result of thinking about that, I also realized that the final data set would not show you all of the information you need about the matched pairs. So change all of the code from the "// BRING BACK ORIGINAL DATA AS RELEVANT" comment forward to:

                Code:
                //    BRING BACK ORIGINAL DATA AS RELEVANT
                gen cusip = cusip_case
                merge 1:m cusip using `original', assert(match using) keep(match) nogenerate
                ds cusip* date gsector, not
                local other_vars `r(varlist)'
                rename (`other_vars') =_case
                replace cusip = cusip_ctrl
                joinby cusip date using `original'
                drop cusip
                rename (`other_vars') =_ctrl

                Comment


                • #23
                  The code works seamlessly. For the record I want to create a summary that can be of use for other Statalist users.


                  Rules for the match: in #14 + finding only controls whose data runs as far as does that of the case.


                  Final code:

                  Code:
                  * CHECK FOR UNIQUE IDENTIFIERS
                  isid cusip date, sort
                  tempfile original
                  save `original'
                  
                  by cusip (date): gen final_date = date[_N]
                  preserve
                  
                  gen IPOmonth = mofd(IPOdate)
                  format IPOmonth %tm
                  
                  keep if IPO & date == IPOmonth & !missing(btm, gsector, me)
                  tempfile cases
                  save `cases'
                  
                  restore
                  keep if !IPO & !missing(btm, gsector, me)
                  tempfile controls
                  save `controls'
                  
                  * IDENTIFY THE MATCHING MONTH (DEC OF IPO YEAR) FOR EACH CASE
                  use `cases', clear
                  
                  * NOW PAIR UP EACH CASE WITH ALL ELIGIBLE CONTROLS BASED ON SECTOR, SIZE, AND BOOK TO MARKET RATIO
                  gen low = me*.70
                  gen high = me*1.30
                  rangejoin me low high using `controls', by(gsector date) keepusing(cusip me btm final_date)
                  
                  keep if final_date_U >= final_date
                  gen delta = abs(btm - btm_U)
                  set seed 1234
                  gen double shuffle = runiform()
                  by cusip (delta shuffle), sort: keep if _n == 1
                  keep cusip cusip_U
                  rename cusip cusip_case
                  rename cusip_U cusip_ctrl
                  
                  * BRING BACK ORIGINAL DATA AS RELEVANT
                  gen cusip = cusip_case
                  merge 1:m cusip using `original', assert(match using) keep(match) nogenerate
                  ds cusip* date gsector, not
                  local other_vars `r(varlist)'
                  rename (`other_vars') =_case
                  replace cusip = cusip_ctrl
                  joinby cusip date using `original'
                  drop cusip
                  rename (`other_vars') =_ctrl

                  Initial appearance of data: in #16 with correction of IPOdate in #18


                  Final appeareance of data:

                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input str10(cusip_case cusip_ctrl) long date double(ret_case year_case) str2 gsector float(me_case btm_case) long IPOdate_case float IPO_case double(ret_ctrl year_ctrl) float(me_ctrl btm_ctrl)
                  "00077R10" "05945K10" 413  -6.122449040412903 1994 "40"   49.381   .3244783 12570 1   18.18181872367859 1994    37.778  .3133632
                  "00077R10" "05945K10" 414                   0 1994 "40"   49.381   .3244783 12570 1 -3.8461539894342422 1994    36.325  .3258977
                  "00077R10" "05945K10" 415  30.434781312942505 1994 "40"    64.41  .24876665 12570 1  -21.99999988079071 1994   28.3335  .4178176
                  "00077R10" "05945K10" 416   1.666666753590107 1994 "40"  65.4835  .24468854 12570 1 -10.256410390138626 1994  25.59375  .4625439
                  "00077R10" "05945K10" 417  1.6393441706895828 1994 "40"   66.557  .24074194 12570 1  11.428571492433548 1994  28.51875 .41510355
                  "00077R10" "05945K10" 418   6.451612710952759 1994 "40"   70.851  .22615153 12570 1  -33.33333432674408 1994   19.0125  .6226553
                  "00077R10" "05945K10" 419  22.727273404598236 1994 "40"  86.9535   .1842716 12570 1                   0 1994   18.8175  .6291077
                  "00077R10" "05945K10" 420   4.938271641731262 1995 "40"   91.715  .20684634 12570 1  3.8461539894342422 1995  19.54125  .7015798
                  "00077R10" "05945K10" 421    2.35294122248888 1995 "40"   93.873  .20209123 12570 1 -3.7037037312984467 1995   18.8175  .7285637
                  "00077R10" "05945K10" 422  13.793103396892548 1995 "40"  106.821  .17759533 12570 1 -7.6923079788684845 1995     17.37  .7892773
                  "00077R10" "05945K10" 423  2.0202020183205605 1995 "40"  109.181  .17375652 12570 1   -8.33333358168602 1995   15.9225  .8610298
                  "00077R10" "05945K10" 424   4.950495064258575 1995 "40"  114.586  .16556047 12570 1  13.636364042758942 1995  18.09375  .7577062
                  "00077R10" "05945K10" 425  11.320754885673523 1995 "40"  127.558  .14872381 12570 1 -15.999999642372131 1995 15.400875  .8901927
                  "00077R10" "05945K10" 426   6.779661029577255 1995 "40"  136.458  .13902381 12570 1   11.90476194024086 1995 17.234312  .7954913
                  "00077R10" "05945K10" 427    8.33333358168602 1995 "40" 147.8295  .12832966 12570 1  10.638298094272614 1995  19.06775  .7190018
                  "00077R10" "05945K10" 428  10.989011079072952 1995 "40" 164.0745  .11562376 12570 1 -7.6923079788684845 1995    17.601  .7789186
                  "00077R10" "05945K10" 429   16.83168262243271 1995 "40" 192.7235   .0984359 12570 1                   0 1995    17.601  .7789186
                  "00077R10" "05945K10" 430   15.25423675775528 1995 "40"  222.122  .08540762 12570 1  -10.41666641831398 1995 15.767563  .8694905
                  "00077R10" "05945K10" 431  29.411765933036804 1995 "40"  287.452 .065996796 12570 1  -6.976744532585144 1995   14.5175    .94436
                  "00077R10" "05945K10" 432   9.090909361839294 1996 "40"  313.584   .5776754 12570 1                  25 1996 18.146875  .8765761
                  "00077R10" "05945K10" 433            17.96875 1996 "40"   369.95   .4896601 12570 1  -3.999999910593033 1996    17.421  .9131001
                  "00077R10" "05945K10" 434  23.178808391094208 1996 "40"    455.7   .3975198 12570 1    8.33333358168602 1996  18.80125  .8460669
                  "00077R10" "05945K10" 435   34.40860211849213 1996 "40" 838.5625   .2160242 12570 1                   0 1996  18.80125  .8460669
                  "00077R10" "05945K10" 436   .4000000189989805 1996 "40" 841.9167  .21516353 12570 1  7.6923079788684845 1996   20.2475  .7856336
                  "00077R10" "05945K10" 437 -19.920319318771362 1996 "40" 674.2042   .2686868 12570 1   -7.14285746216774 1996   18.7655  .8476788
                  end
                  format %tm date
                  format %td IPOdate_case

                  Some stats for the matching operation:

                  With 0.7 < me/me_control < 1.3

                  # Paired cases: 5,495 out of 7,127
                  Mean "me" dispersion: 15%
                  Mean "btm" dispersion: 88%
                  With a dispersion on "me" of 50% as good as it gets is 50% on "me" and 50% on "btm". Nonetheless, "me" is more relevant for the focus of my research.



                  I can certainly start working on portfolio performance evaluation now that you have helped me to sort this data but, first of all, I would like to asksome final questions about the code. I have run it piece by piece and tried to understand every step along the way. However, I cannot identify the point in which we command Stata to match case-controls on the "btm" from December of the IPO year. For what I understand, I think that happens somewhere around the following lines:

                  Code:
                  keep if IPO & date == IPOmonth & !missing(btm, gsector, me)
                  tempfile cases
                  save `cases'
                  
                  * (...)
                  
                  * IDENTIFY THE MATCHING MONTH (DEC OF IPO YEAR) FOR EACH CASE
                  use `cases', clear
                  
                  * NOW PAIR UP EACH CASE WITH ALL ELIGIBLE CONTROLS BASED ON SECTOR, SIZE, AND BOOK TO MARKET RATIO
                  gen low = me*.70
                  gen high = me*1.30
                  rangejoin me low high using `controls', by(gsector date) keepusing(cusip me btm final_date)
                  However, if we keep the observations whose date == IPOmonth and later on we use only those, how is it possible that we are also matching on "btm" based on December when the IPOdate is June (for example in this case)? Actually, the mean dispersion in "btm" calculated at IPOdate is lower than that at the month(s) of December. Please let me know if I miss something in here.


                  Thank you for your time.

                  Comment


                  • #24
                    In the part of the code where we create the `cases' tempfile, you see:
                    Code:
                    gen IPOmonth = mofd(IPOdate)
                    format IPOmonth %tm
                    
                    keep if IPO & date == IPOmonth & !missing(btm, gsector, me)
                    So the `cases' tempfile contains only the observation that occurs in the case's IPO month. When we do the -rangejoin- with the `controls', the -by(gsector date)- option forces that the pairing of a given control must be from the same month as the case's date, which, as just noted, is the IPO date. So the case-control matching at that point involves only data from the month of the IPO. The next step is refining the match based on closest btm Finally, the -merge- and -joinby- at the end bring back the data from all the other months.

                    Comment


                    • #25
                      So as far as I understand we are doing all the matching based on IPOdate. One of the requirements was that "me" should be matched on IPOdate and "btm" on December of year(IPOdate). Just for the record, the way I have fixed this is by using the following lines at the top of the code:

                      Code:
                      sort cusip date
                      gen btm2=btm if month==12
                      enlarge btm2, by(cusip year)
                      gen btm3=btm2 if IPO & month==12 & year==year(IPOdate)
                      enlarge btm3, by(cusip)
                      replace btm=btm2 if IPO==0
                      replace btm=btm3 if IPO
                      drop btm2 btm3
                      Now the match on "btm" is based only on December of the IPO year.




                      A last question that comes to mind about the code: What do we need the double shuffle command for? Would not it be the same if we simply sort by the dispersion of "btm" and keep the first observation for each "cusip_case"? To test this I have run a code with

                      Code:
                      gen delta = abs(btm - btm_U)
                      by cusip (delta), sort: keep if _n == 1
                      instead of ...

                      Code:
                      gen delta = abs(btm - btm_U)
                      set seed 1234
                      gen double shuffle = runiform()
                      by cusip (delta shuffle), sort: keep if _n == 1
                      And the stats and result are pressumably the same. So what is it actual function?


                      Thank you for your time.

                      Comment

                      Working...
                      X