Hello,
I have a dataset with equal numbers of treated firms (ETS == 1) and control firms (ETS == 0): each treated firm has been matched to a control. Where more than one good match was found for a treated firm, that treated firm appears once for each good match.
Instead of a treated firm appearing once for each good match, I want it to appear only once (via duplicates drop or collapse), and I then want to average the matched control observations belonging to the dropped duplicates, so that there are still equal numbers of treated and control firms.
For example:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input float n str29 bvd_id double(ETS incorporation_year green_pat9903 green_pat0412 morder_treated morder_control)
 1 "CZ15044572" 1 1991 0 0    11 751.5
 2 "CZ15503461" 1 1992 0 0    12 752.5
 3 "CZ15504077" 1 1991 0 0    13 753.5
 4 "CZ16193679" 1 1991 0 0    14 754.5
 5 "CZ16193679" 1 1991 0 0    15 755.5
 6 "CZ16193679" 1 1991 0 0    16 756.5
 7 "CZ49062905" 0 1993 0 0 651.5    11
 8 "CZ60108631" 0 1994 0 0 652.5    12
 9 "CZ25137026" 0 1997 0 0 653.5    13
10 "CZ11360097" 0 1991 0 0 654.5    14
11 "CZ12893811" 0 1991 0 0 655.5    15
12 "CZ13009796" 0 1991 0 0 656.5    16
end
bvd_id is the firm identifier, so in the dataex above rows 5 and 6 are duplicates of row 4.
morder_treated and morder_control identify the matches: a treated observation with morder_treated == 11 is matched to the control observation with morder_control == 11, and vice versa. In the example, rows 10, 11 and 12 are therefore the three matches found for rows 4, 5 and 6.
I can't figure out how to tell Stata which control observations should be averaged.
Any suggestions would be appreciated!
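For reference, here is a minimal sketch of one way the task described above might be approached. It has not been checked against the full dataset. It assumes that the example data are already in memory, that every pair consists of exactly one treated and one control row, and that the duplicated treated rows carry identical values. The names pair, treated_id and n_matches are hypothetical helpers, not variables in the original data.

```stata
* Sketch only: assumes the -dataex- example is already in memory.
* pair, treated_id and n_matches are hypothetical helper names.

* 1. Build one pair identifier shared by a treated row and its matched control.
generate double pair = cond(ETS == 1, morder_treated, morder_control)

* 2. Copy the treated firm's bvd_id onto its matched control.
*    Within each pair the control (ETS == 0) sorts first and the treated last.
generate treated_id = bvd_id if ETS == 1
bysort pair (ETS): replace treated_id = treated_id[_N]

* 3. Average within treated firm and group: this leaves one treated row and
*    one averaged control row per treated firm.
collapse (mean) incorporation_year green_pat9903 green_pat0412 ///
    (count) n_matches = pair, by(treated_id ETS)
```

After the collapse, the control firms' own bvd_id values are gone and each control row is labelled by the treated firm it belongs to. n_matches records how many rows were averaged. Note that averaging a variable such as incorporation_year can produce non-integer values when the matched controls differ.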