  • Matching firms based

    Dear all,

    I am currently writing code to capture the differences in earnings management between US firms and cross-listed firms (foreign firms listed on an American stock exchange). Because the cross-listed firms are self-selected, the sample might be biased. Therefore, I have to match the cross-listed firms with US firms based on:

    - mtb (market-to-book ratio)
    - roa (return on assets)
    - at (total assets)


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long gvkey double fyear float(dummy_usa dummy_foreign) double at float(roa mtb)
    1004 1996 1 0  529.584    .04347752          .
    1004 1997 1 0  670.559    .05317504          .
    1004 1998 1 0   726.63    .05734831  1.6586404
    1004 1999 1 0  740.998    .04745357  1.0978953
    1004 2000 1 0  701.854    .02640293  1.1084794
    1004 2001 1 0  710.199   -.08298942  1.1752149
    1004 2002 1 0  686.621  -.018074017   .4858825
    1004 2003 1 0  709.292   .004940137  1.0239426
    1004 2004 1 0   732.23   .021104025  1.6606493
    1004 2005 1 0  978.819   .035923906  2.0879886
    1004 2006 1 0 1067.633    .05494398  2.4809506
    1004 2007 1 0  1362.01     .0551714  1.2772952
    1004 2008 1 0 1377.511    .05709646   .8701464
    1004 2009 1 0 1501.042   .029731346  1.0421851
    1004 2010 1 0 1703.727    .04098427  1.2568352
    1004 2011 1 0 2195.653    .03084413  .56036645
    1004 2012 1 0   2136.9    .02573822   .8591657
    1004 2013 1 0   2199.5   .033143897   .9606355
    1004 2014 1 0     1515   .006732673  1.2381912
    1004 2015 1 0   1442.1   .033076763   .9731014
    1010 1997 1 0   3181.3   .068022504          .
    1010 1998 1 0   3257.3   .020630583          .
    1010 1999 1 0   3563.4   .021692766          .
    1010 2000 1 0   3794.5   .021056794          .
    1010 2001 1 0   3723.1    .03913406          .
    1010 2002 1 0   3702.5   .021525996          .
    1010 2003 1 0   4832.1    .07294965          .
    1013 1997 1 0  936.303    .11624122          .
    1013 1998 1 0 1300.587    .11281598   3.393175
    1013 1999 1 0 1672.529     .0523967   5.735749
    1013 2000 1 0   3970.5    .21863745   5.652886
    1013 2001 1 0   2499.7   -.51514184   1.903243
    1013 2002 1 0   1144.2   -1.0006992   1.725441
    1013 2003 1 0   1296.9   -.05914103  3.3024726
    1013 2004 1 0   1428.1    .01148379   2.715488
    1013 2005 1 0     1535    .07211726  2.6268575
    1013 2006 1 0   1611.4      .040772  1.9200138
    1013 2007 1 0   1764.8    .06023346  2.1825328
    1013 2008 1 0     1921  -.021811556   .7718683
    1013 2009 1 0   1343.6    -.3530068  2.2617743
    1013 2010 1 0   1474.5    .04204815      2.835
    1019 1997 1 0    26.71     .0426432          .
    1019 1998 1 0   29.283    .05624424   2.919797
    1019 1999 1 0   29.341    .03489997   2.787749
    1019 2000 1 0   28.638    .06152664  2.3041565
    1019 2001 1 0   30.836    .04183422   3.302927
    1021 1997 1 0   20.516    .07550205          .
    1021 1998 1 0   18.661   -.17833985  1.0966128
    1021 1999 1 0   13.986   -.15780066   .6230607
    1021 2000 1 0   11.608   -.06960717  1.0197082
    1021 2001 1 0    8.635    -.2012739  1.0919029
    1021 2002 1 0     7.85   .010700637  .55830675
    1021 2003 1 0    6.044   -.25066182   1.194015
    1021 2004 1 0    6.245     .2153723   5.218199
    1021 2005 1 0    8.153    .23304304   4.236929
    1021 2006 1 0   14.341     .0700788   2.719383
    1021 2007 1 0   27.171   -.17198484  2.0490286
    1021 2008 1 0   21.401    -.5162843  1.2018434
    1034 1997 1 0  631.866   .027550146          .
    1034 1998 1 0  908.936    .02663664   3.564293
    1034 1999 1 0 1160.266   .031865105   2.587424
    1034 2000 1 0 1610.435   .034467705   2.080925
    1034 2001 1 0 2390.008  -.015863545   1.314704
    1034 2002 1 0 2296.924    -.0433889   .6115829
    1034 2003 1 0 2329.268   .005938776   .9239342
    1034 2004 1 0 2003.842   -.15706678  1.0131919
    1034 2005 1 0 1623.383    .08240138  1.6793386
    1034 2006 1 0  927.239    .08902128   1.434651
    1034 2007 1 0 1288.165  -.010542904   1.206971
    1036 1997 0 1 1778.547    .07757906          .
    1036 1998 0 1  2113.32    .04717128   .9201303
    1036 1999 0 1 2241.575    .03966407   .8469118
    1036 2000 0 1 2325.377    .02431864  .51730186
    1037 1996 1 0    4.969     -.555645          .
    1037 1997 1 0     5.45     .1719266          .
    1037 1998 1 0    3.228    -1.078067  15.407714
    1037 1999 1 0    4.575    -.2450273  72.605804
    1037 2000 1 0    6.373    .18264553   6.106415
    1037 2001 1 0   17.867     .0374993  3.4009595
    1038 1996 1 0  718.213   .026447587          .
    1038 1997 1 0   795.78   -.03078615          .
    1038 1998 1 0   975.73  -.016414378   3.127864
    1038 1999 1 0 1188.805   -.04642225  2.0251205
    1038 2000 1 0 1047.264   -.10110727 -2.8141334
    1038 2001 1 0  1279.17  -.008965189  1.7854867
    1038 2002 1 0 1491.698  -.013609993  1.0782552
    1038 2003 1 0 1506.534  -.007111688  2.0165327
    1043 1997 1 0     44.9  .0022939867          .
    1043 1998 1 0   45.639    .02346677  -1.848265
    1043 1999 1 0    42.21 -.0032693674  -.6050181
    1045 1997 1 0    20915    .04709538          .
    1045 1998 1 0    22303    .05891584    1.43031
    1045 1999 1 0    24374    .04041192  1.4482962
    1045 2000 1 0    26213   .031015145   .8304026
    1045 2001 1 0    32841   -.05365245   .6411717
    1045 2002 1 0    30267   -.11600093  1.0764759
    1045 2003 1 0    29330   -.04186839    44.9258
    1045 2004 1 0    28773  -.026448406 -3.0372775
    1045 2005 1 0    29495   -.02919139  -2.748398
    1045 2006 1 0    29145   .007925888  -11.08553
    end

    I currently have 5,208 cross-listed firms, and I want to reduce the number of American firms (16,663) to the same number (the total number of observations is 186,587).

    My question looks similar to the problem discussed here: https://www.statalist.org/forums/for...with-firm-size
    However, if I follow that approach, I end up with only 326 observations. Moreover, the commands in that post place each American firm in the same row as its matched cross-listed firm. What I want instead is to drop all unmatched American observations and keep a dataset in which the matched US and cross-listed firms remain separate observations, so that I can still run regressions on them.

    Perhaps what I mean is not exactly called 'matching'. In any case, for each cross-listed firm I want to keep the one American firm that is most similar in terms of mtb, roa and at.

    Also, I might have to sort firms by year first (that is, the main criterion for matching is that the observations are in the same year). If that is necessary, how can I adapt the code?
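
    To make the goal concrete, here is a minimal sketch of same-year nearest-neighbour matching that keeps the data in long form. It rests on several assumptions that are not settled above: matching is done per firm-year rather than per firm, a US firm-year may be reused as a match (matching with replacement), at enters in logs, the three variables are standardised over the pooled sample, and similarity is plain Euclidean distance. The toy data are a few rows taken from the -dataex- example; the names ln_at, z_*, us_* and dist are made up for illustration.

```stata
* Sketch only: same-year nearest-neighbour matching, with replacement.
* Toy data: a subset of the -dataex- example (variable names as in the post).
clear
input long gvkey double fyear float(dummy_usa dummy_foreign) double at float(roa mtb)
1004 1998 1 0   726.63   .05734831  1.6586404
1004 1999 1 0  740.998   .04745357  1.0978953
1004 2000 1 0  701.854   .02640293  1.1084794
1034 1998 1 0  908.936   .02663664   3.564293
1034 1999 1 0 1160.266  .031865105   2.587424
1034 2000 1 0 1610.435  .034467705   2.080925
1036 1998 0 1  2113.32   .04717128   .9201303
1036 1999 0 1 2241.575   .03966407   .8469118
1036 2000 0 1 2325.377   .02431864  .51730186
1045 1998 1 0    22303   .05891584    1.43031
1045 1999 1 0    24374   .04041192  1.4482962
1045 2000 1 0    26213  .031015145   .8304026
end

tempfile full us pairs keepus keys

* Drop firm-years that cannot be matched, then standardise each variable
* so that no single one dominates the distance (logging at is an assumption).
drop if missing(mtb, roa, at) | at <= 0
generate double ln_at = ln(at)
foreach v in mtb roa ln_at {
    quietly summarize `v'
    generate double z_`v' = (`v' - r(mean)) / r(sd)
}
save `full'

* US firm-years, renamed so they can sit next to a cross-listed firm-year.
keep if dummy_usa == 1
keep gvkey fyear z_mtb z_roa z_ln_at
rename (gvkey z_mtb z_roa z_ln_at) (us_gvkey us_mtb us_roa us_ln_at)
save `us'

* Form every same-year pair: cross-listed firm-year x US firm-year.
use `full', clear
keep if dummy_foreign == 1
keep gvkey fyear z_mtb z_roa z_ln_at
joinby fyear using `us'

* Euclidean distance on the standardised variables; keep the closest US
* firm-year for each cross-listed firm-year (ties broken by us_gvkey).
generate double dist = sqrt((z_mtb - us_mtb)^2 + (z_roa - us_roa)^2 + (z_ln_at - us_ln_at)^2)
bysort gvkey fyear (dist us_gvkey): keep if _n == 1
keep gvkey us_gvkey fyear
save `pairs'

* List of matched US firm-years (a US firm-year can be matched more than once).
keep us_gvkey fyear
rename us_gvkey gvkey
duplicates drop
save `keepus'

* Stack cross-listed and matched US keys, then keep only those rows of the
* original long dataset (assumes gvkey fyear uniquely identifies a row).
use `pairs', clear
keep gvkey fyear
append using `keepus'
save `keys'
use `full', clear
merge 1:1 gvkey fyear using `keys', keep(match) nogenerate
```

    The result has one row per firm-year, containing only the cross-listed firm-years and their nearest US firm-years, so regressions can be run on it directly. Note that joinby builds every same-year pair, which may become large with 186,587 observations; looping over years would be one way around that. Matching at the firm level instead would presumably require collapsing to one row per firm first.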

    Thank you in advance.
    Last edited by Hidde van Lent; 15 Mar 2019, 03:31.