Hi,
I want to use propensity score matching (PSM) to create a control group, that is, a sample of treated observations plus matched controls for further calculations, as done in Bena/Li (2014).
Therefore, I'd like to ask for your advice on which of the available PSM commands (psmatch2, teffects psmatch, kmatch) to use.
Additional information:
- I want to create 2 control groups: (1) matched on size (SIZE) and industry (SIC); (2) matched on size (SIZE), industry (SIC), and book-to-market (BTM)
- My treatment variable is deal, which is 1 if the observation describes a deal and 0 otherwise
- Year (Year) and industry (SIC, the 2-digit SIC code) should be exact matches
- I need 5 control firms for each firm in the initial sample (initial = only the deals, i.e. deal = 1)
- Preferably, I'd like to match without replacement (the no replace option)
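To make the requirements above concrete, here is a rough, untested sketch of one way they could be expressed with psmatch2. It assumes the variable names from the example data (BV_ln_lag1 as the size measure). The names cell, ps and ps_cell are hypothetical helpers introduced only for this illustration, and the caliper value is an arbitrary choice.

```stata
* Sketch only: assumes psmatch2 is installed (ssc install psmatch2)
* and that the variables from the example data are in memory.
egen long cell = group(Year SIC_two)   // hypothetical id for each Year x SIC_two cell
logit deal BV_ln_lag1                  // control group 1 (size); add BTM_lag1 for group 2
predict double ps if e(sample)         // estimated propensity score, between 0 and 1
gen double ps_cell = cell*10 + ps      // scores from different cells end up more than 9 apart
* With a caliper below 9, neighbors can only come from the same Year x SIC_two cell.
psmatch2 deal, pscore(ps_cell) neighbor(5) caliper(0.5)
pstest BV_ln_lag1, both                // balance before and after matching
```

This sketch matches with replacement. As far as I understand the psmatch2 help file, its noreplacement option is meant for 1-to-1 matching only, so 5 controls per deal without replacement may require a different command.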
An example of my data (generated with -dataex-) is shown in the first code block below.
I tried the matching commands shown in the second code block below, but they lead to significant differences between the control group and the treated group in the matched sample.
Additionally, I want to run a regression on the matched sample with deal as the dependent variable (y), total_acc_freq_5 as the main independent variable (x1), and control variables (x2, x3, ...). However, the regress command shown at the end of the code below does not work. I know that the command is usually structured like this: reg y x1 x2 t [fweight=_weight], but in my case the treatment variable is also the dependent variable.
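For illustration, a hedged sketch of how such a weighted regression on the matched sample might be written. It assumes psmatch2 has just been run, so that _weight exists and is missing for unmatched observations. The controls are placeholders taken from the example data. Stata's fweight accepts only integer weights, so it fails if _weight contains non-integer values, and pweight is used here instead.

```stata
* Sketch only: assumes _weight was created by a preceding psmatch2 call.
* Keep matched observations (non-missing _weight) and use _weight as a probability weight.
regress deal total_acc_freq_5 BV_ln_lag1 BTM_lag1 if _weight < . [pweight = _weight]
```

Whether pweight is the appropriate weight type for this design is an assumption of the sketch, not something I can confirm.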
So, my questions are:
1) Does using psmatch2 make sense here?
2) What are the steps to get a matched sample in which the treated and untreated groups do not differ significantly?
3) How do I run a regression using the results of the PSM, including the weights?
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float deal str6(a_cusip_six Target6digitCUSIP) byte SIC_two float(Year BTM_lag1 BV_ln_lag1 ind_year total_acc_freq_5)
1 "000307" "7C7919" 80 2015 .22084355 4.983278 988 2274.3284
1 "000307" "5C7040" 80 2015 .22084355 4.983278 988 2274.3284
1 "000307" "7C1485" 80 2015 .22084355 4.983278 988 2274.3284
0 "00032Q" "" 28 2022 .3129321 5.062025 1358 281.83923
0 "000360" "" 35 2000 .6573898 4.07169 82 1447.444
0 "000360" "" 35 2001 .7536119 4.341439 142 1442.1274
0 "000360" "" 35 2002 .3597853 4.3346076 202 1433.0425
0 "000360" "" 35 2003 .3818808 4.5186644 261 1711.893
0 "000360" "" 35 2004 .4200801 4.625806 319 1985.6843
0 "000360" "" 35 2005 .53020567 4.65612 377 1690.809
0 "000360" "" 35 2006 .5193562 4.732736 435 1391.1727
0 "000360" "" 35 2007 .4010745 4.867965 494 1926.067
0 "000360" "" 35 2008 .3832543 4.921002 552 1907.8893
0 "000360" "" 35 2009 .3916884 4.946936 610 3227.913
0 "000360" "" 35 2010 .4655783 5.051208 668 4255.303
0 "000360" "" 35 2011 .3442122 5.076903 726 4750.8145
0 "000360" "" 35 2012 .3548234 5.18728 784 6022.014
0 "000360" "" 35 2013 .3781444 5.265241 842 6746.429
0 "000360" "" 35 2014 .1836823 5.372701 900 5929.078
0 "000360" "" 35 2015 .1926586 5.45154 958 4858.6914
0 "000360" "" 35 2016 .18916784 5.450412 1016 5600.611
0 "000360" "" 35 2017 .14742124 5.547246 1074 4321.739
0 "000360" "" 35 2018 .15425764 5.692991 1132 5071.652
0 "000360" "" 35 2022 .15583254 6.477249 1365 6219.215
0 "000361" "" 50 2001 1.987912 6.607998 156 320.2961
0 "000361" "" 50 2002 1.8610992 6.553725 216 399.1839
0 "000361" "" 50 2003 1.9479238 6.565545 275 557.50214
0 "000361" "" 50 2004 4.790507 6.531783 333 873.725
0 "000361" "" 50 2005 2.2961338 6.564267 391 713.7247
0 "000361" "" 50 2006 1.400916 6.596095 449 790.8918
0 "000361" "" 50 2007 1.1089821 6.886347 508 1023.9554
1 "000361" "00320W" 50 2008 .8706896 6.973199 566 941.3951
0 "000361" "" 50 2012 1.6228745 7.440574 798 1474.8085
0 "000361" "" 50 2013 4.5244174 7.694235 856 1628.44
0 "000361" "" 50 2014 2.704927 7.667111 914 1629.7904
1 "000361" "9E4796" 50 2015 2.2880285 7.695985 972 1944.0752
0 "000361" "" 50 2019 .982753 7.329553 1204 1736.794
0 "000361" "" 50 2020 1.4494097 7.324622 1262 1642.347
0 "000361" "" 50 2021 2.9368286 7.639642 1320 1812.9552
0 "000361" "" 50 2022 1.0425171 7.339343 1379 2066.439
0 "000361" "" 50 2023 .9222679 7.361312 1428 .
0 "000752" "" 34 2000 1.2341483 6.117703 81 482.5416
0 "00081T" "" 27 2006 1.490755 7.565016 427 790.8918
0 "00081T" "" 27 2007 1.2994742 7.522725 486 1023.9554
0 "00081T" "" 27 2008 2.1877666 7.54882 544 941.3951
0 "00081T" "" 27 2009 6.839888 7.156332 602 937.736
0 "00081T" "" 27 2010 2.785915 7.009228 660 934.5398
0 "00081T" "" 27 2011 2.456704 7.047169 718 1087.9885
0 "00081T" "" 27 2012 2.0859506 7.018133 776 1474.8085
0 "00081T" "" 27 2013 3.019617 7.827121 834 1628.44
0 "00081T" "" 27 2014 3.1197054 7.776073 892 1629.7904
0 "00081T" "" 27 2015 2.2080333 7.708141 950 1944.0752
0 "00081T" "" 27 2016 2.5934224 7.577327 1008 2027.8108
0 "00081T" "" 27 2017 1.46607 7.632643 1066 1643.2063
1 "00081T" "4H9024" 27 2018 2.1505983 7.937053 1124 1886.0978
0 "00081T" "" 27 2022 3.905836 8.036347 1357 2066.439
0 "00086T" "" 59 2000 2.1053352 4.506642 105 1067.537
0 "00086T" "" 59 2001 1.7689255 4.6764855 165 941.9887
0 "00086T" "" 59 2002 .5621461 4.818756 225 935.3834
0 "00086T" "" 59 2003 .8227532 5.281466 284 930.0809
0 "00086T" "" 59 2004 .6217065 5.445849 342 810.6956
0 "00086T" "" 59 2005 .5370531 5.717396 400 577.9028
0 "00086T" "" 59 2006 1.084691 5.745427 458 577.2553
0 "00086T" "" 59 2007 .7547431 5.798599 517 576.17725
0 "00086T" "" 59 2008 1.1532676 5.77421 575 688.7772
0 "00086T" "" 59 2009 10.251554 5.682715 633 799.4882
0 "00086T" "" 59 2010 3.633331 5.581498 691 568.2173
0 "00087B" "" 50 2001 2.8610985 3.803145 156 515.9566
0 "00087B" "" 50 2002 9.084736 3.999704 216 507.13855
0 "00087B" "" 50 2003 11.52586 4.036539 275 544.6887
0 "00087B" "" 50 2004 13.02537 3.9651465 333 401.8929
0 "00087B" "" 50 2005 3.791897 4.0729 391 307.3124
0 "00087B" "" 50 2006 1.5377173 4.215145 449 299.66293
0 "00087B" "" 50 2007 1.810381 4.331207 508 335.68335
1 "000886" "11138X" 36 2000 .2335486 7.422092 83 810.7519
1 "000886" "45052K" 36 2000 .2335486 7.422092 83 810.7519
1 "000886" "02152K" 36 2000 .2335486 7.422092 83 810.7519
1 "000886" "695934" 36 2000 .2335486 7.422092 83 810.7519
1 "000886" "152317" 36 2000 .2335486 7.422092 83 810.7519
1 "000886" "19974H" 36 2001 .2411455 8.286647 143 802.7621
0 "000886" "" 36 2005 .7976782 7.2641 378 976.6392
0 "000886" "" 36 2006 .7550697 7.336286 436 968.3251
1 "000886" "50175R" 36 2007 .9608069 7.384859 495 1152.2501
1 "000886" "15618J" 36 2007 .9608069 7.384859 495 1152.2501
0 "00088E" "" 73 2000 .0818547 3.337618 108 421.0132
1 "00088E" "45723V" 73 2001 1.0846641 5.464476 168 419.22485
0 "00088U" "" 73 2000 .12377501 2.0458848 108 972.3721
1 "00088U" "46010P" 73 2001 .689868 1.9194195 168 954.9915
1 "00088U" "65423N" 73 2002 .4505208 2.2885876 228 940.2632
1 "00088U" "68286F" 73 2006 .3305296 3.085573 461 1310.6427
0 "000899" "" 28 2015 .26232374 3.304209 951 1127.3561
0 "000899" "" 28 2016 .27430806 3.166108 1009 1239.6344
0 "000899" "" 28 2017 .3589648 3.164842 1067 1125.0939
0 "000899" "" 28 2018 .742564 4.6823072 1125 787.2484
0 "000899" "" 28 2019 .802248 4.487242 1183 1237.1703
0 "000899" "" 28 2020 .5356342 4.844903 1241 1186.4073
0 "000899" "" 28 2021 1.0152136 5.335965 1299 1294.7832
0 "000899" "" 28 2022 1.0005624 5.621317 1358 1511.6017
0 "00089C" "" 38 2001 .55929387 4.889371 145 1250.4623
0 "00089C" "" 38 2002 .6786637 4.988437 205 934.9863
end
Code:
ssc install psmatch2, replace
logit deal i.SIC_two i.Year BTM_lag1 BV_ln_lag1
predict double pscore if e(sample)
gen double pscore2 = ind_year*100+pscore
psmatch2 deal (pscore2), out (total_acc_freq_5) neighbor (5) caliper(0.05) common //noreplace
psmatch2 deal (pscore2), out (total_acc_freq_5) neighbor (5) common
pstest BTM_lag1 BV_ln_lag1, both
regress deal total_acc_freq_5 BV_ln BTM LEV ROA REV R_D [fweight=_weight]
Best,
Sebastian