  • Using Propensity Score Matching (PSM) to identify comparable firms in a specific year

    Dear Statalist,
    I would like to use nearest-neighbour propensity score matching (PSM) to find, for each company that received venture capital (VC) investment in a given year, a group of 10 non-VC-backed control companies that had the most similar probability of receiving funding from a venture capitalist.
    The problem is that I am not sure about the methodology I am following, and I would like feedback from experts. Before going into detail, I will briefly explain each variable in the dataset.

    -------------------------------------------------------------------------------------------------------------------------------------------------------------
    VARIABLES:
    -> treatment: dummy equal to 1 if the company is in the treatment group and 0 if the company belongs to the set of potential comparable firms. The dataset contains roughly 200 treated firms and 50,000 potential comparable firms.
    -> id: identifier of the company
    -> year
    -> T: timeline variable equal to 0 in the event year (the year in which the treated company received the VC investment)
    -> Industry: industry of the company
    -> GeographicalArea: geographical area in which the company is located
    -> ln_Firm_age: logarithm of firm age
    -> Intangible_ratio: total intangibles over total assets
    -> ln_Total_assets: logarithm of total assets
    -> ln_Revenues: logarithm of revenues
    -> Revenues_growth: growth rate of revenues
    -> profitloss: profit or loss of the company
    -> Employees: number of employees

    At the end of this post, I also include a very small extract from my dataset.
    -------------------------------------------------------------------------------------------------------------------------------------------------------------

    The procedure I used to identify comparable firms is the following:

    1. I estimated the following model on my treatment group only (treatment=1). This step aims to find a model able to estimate the probability of receiving VC support from some proxy variables (Employees ln_Revenues Intangible_ratio ln_Firm_age), based on what actually happened.
    Code:
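    * Flag the event year (T == 0), in which the treated firm received the VC investment
    gen VCsupport=0
    replace VCsupport=1 if T==0
    * Fit the logit on the treated firms only, then tabulate correct classifications
    logit VCsupport Employees ln_Revenues Intangible_ratio ln_Firm_age if treatment==1
    estat classification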
    The model runs, although the post-estimation classification table shows that it correctly predicts only 25% of the support events.

    2. I computed the probability of receiving a VC investment for all the companies in my dataset (treatment=1 and treatment=0) using the coefficients estimated in the previous step.
    Code:
    gen ProbabilityOfSupport=-0.1426623*Employees-0.0245476*ln_Revenues+0.0089229*Intangible_ratio-1.026105*ln_Firm_age
    3. I performed the nearest-neighbour propensity score matching procedure, using:
    -> as dependent variable: the probability estimated in the previous step, i.e. ProbabilityOfSupport
    -> as treatment variable: treatment
    -> as matching variables: ln_Firm_age ln_Revenues Revenues_growth ln_Total_assets i.Industry i.GeographicalArea

    Code:
    teffects nnmatch (ProbabilityOfSupport ln_Firm_age ln_Revenues Revenues_growth ln_Total_assets i.Industry i.GeographicalArea) (treatment), nneighbor(10) gen(match) dmvariables
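
    Regarding steps 1 and 2, here is a minimal sketch of how fitted probabilities could be obtained with -predict- rather than by typing the coefficients by hand. It runs on simulated data: only the variable names come from this post, while the data-generating values, the seed, and the new names xb_hat, p_hat and xb_manual are assumptions made for illustration. As far as I understand, a hand-coded sum of coefficient times variable is the linear index (log-odds) without the constant, whereas the probability is the inverse logit of the full index.

```stata
* Simulated data (assumption): only the variable names are taken from the post
clear
set seed 12345
set obs 1000
gen double ln_Firm_age      = rnormal(2, 0.5)
gen double ln_Revenues      = rnormal(7, 1)
gen double Intangible_ratio = runiform()
gen double Employees        = floor(50*runiform())
gen byte VCsupport = runiform() < invlogit(-1 + 0.3*ln_Revenues - 1*ln_Firm_age)

* Fit the logit, then let -predict- compute the fitted values
logit VCsupport Employees ln_Revenues Intangible_ratio ln_Firm_age
predict double xb_hat, xb      // linear index, including _cons
predict double p_hat, pr       // predicted probability = invlogit(xb_hat)

* Hand-coded index without the constant, as in step 2 of the post
gen double xb_manual = _b[Employees]*Employees + _b[ln_Revenues]*ln_Revenues ///
    + _b[Intangible_ratio]*Intangible_ratio + _b[ln_Firm_age]*ln_Firm_age

* Compare the three quantities: only p_hat is bounded between 0 and 1
summarize xb_manual xb_hat p_hat
```

    By default -predict- fills in values for every observation with non-missing covariates, not only the estimation sample, so after a logit fitted with "if treatment==1" it would also produce values for the treatment==0 firms.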
    In conclusion, I want to use PSM only to find comparable firms, but I am not sure about the procedure above, since this is the first time I am using it. The final result also shows that some companies in the treatment group are not matched to any control, despite the large number of carefully preselected potential comparable firms.
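
    To make the matching step and the unmatched-firm check concrete, here is a hedged sketch on simulated data. It counts treated firms with a missing matching variable and then runs -teffects psmatch-, which estimates a propensity score from the listed covariates and matches on it (whereas -teffects nnmatch- matches on a distance computed from the covariates themselves). The data, the seed, the helper variable nmiss, and the use of Revenues_growth as the outcome are assumptions for illustration; only the variable names come from this post. My understanding is that estimation commands work on complete cases, so a firm with a missing value in any model variable might end up without matches.

```stata
* Simulated data (assumption): only the variable names are taken from the post
clear
set seed 2468
set obs 2000
gen double ln_Firm_age     = rnormal(2, 0.5)
gen double ln_Revenues     = rnormal(7, 1)
gen double ln_Total_assets = rnormal(8, 1)
gen byte treatment = runiform() < invlogit(-2 + 0.2*ln_Revenues - 0.3*ln_Firm_age)
gen double Revenues_growth = 0.1*treatment + rnormal(0, 0.3)

* Blank out some values to mimic the gaps visible in the -dataex- extract
replace ln_Revenues = . if runiform() < 0.05

* How many treated firms have at least one missing matching variable?
egen nmiss = rowmiss(ln_Firm_age ln_Revenues ln_Total_assets)
count if treatment == 1 & nmiss > 0

* Propensity-score matching with 10 neighbours; match1-match10 hold the
* observation numbers of the matched firms, so the data must not be re-sorted
teffects psmatch (Revenues_growth) ///
    (treatment ln_Firm_age ln_Revenues ln_Total_assets, logit), ///
    atet nneighbor(10) generate(match)

* Treated firms left without a match
count if treatment == 1 & missing(match1)
```

    Comparing the two counts would show whether the unmatched treated firms are simply those with incomplete data.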

    I would greatly appreciate any feedback or suggestions for improvement. Thanks in advance for your help and patience.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float(treatment id) int year byte T long(Industry GeographicalArea) double ln_Firm_age float(Intangible_ratio ln_Total_assets ln_Revenues Revenues_growth) double profitloss int Employees
    1 1 2008 -2  3 19  3.091042453358316  .007965114  8.077967  7.384061          .    73.156 15
    1 1 2009 -1  3 19 3.1354942159291497  .015642287  8.234834  7.423266  .04000895    12.846 13
    1 1 2010  0  3 19 3.1780538303479458   .02827282  8.415235  7.350791 -.06995244    11.849 15
    1 1 2011  1  3 19 3.2188758248682006  .021038927  8.599017  7.630339   .3227375    10.533 14
    1 1 2012  2  3 19  3.258096538021482           .         .         .          .         .  .
    1 1 2013  3  3 19  3.295836866004329           .         .         .          .         .  .
    1 2 2007 -2 10 15 1.3862943611198906           .         .         .          .         .  .
    1 2 2008 -1 10 15 1.6094379124341003   .27431533  7.874802         0          .  -321.237  .
    1 2 2009  0 10 15  1.791759469228055   .26194736  8.569549         0          .   -438.63 11
    1 2 2010  1 10 15 1.9459101490553132     .187449  8.315349         0          . -1770.359 15
    1 2 2011  2 10 15 2.0794415416798357           .         .         .          .         .  .
    1 2 2012  3 10 15 2.1972245773362196           .         .         .          .         .  .
    0 3 2009  .  8  1  .6931471805599453           .         .         .          .         .  .
    0 3 2010  .  8  1 1.0986122886681098    .4749049  8.047207  8.943937          .  -205.595  .
    0 3 2011  .  8  1 1.3862943611198906    .4340765  8.078743  8.798356 -.13549794   -44.256  .
    0 3 2012  .  8  1 1.6094379124341003    .3699694  8.196082  8.880292  .08539934   -446.88 23
    0 3 2013  .  8  1  1.791759469228055           0  6.953561 2.8003254  -.9978505   -26.622  0
    0 3 2014  .  8  1 1.9459101490553132 .0041904794  7.023394 3.5101414  1.1005177  -142.908  0
    0 4 2007  .  1 15                  .           .         .         .          .         .  .
    0 4 2008  .  1 15                  0   .08861052  3.571418         0          .     9.566  0
    0 4 2009  .  1 15  .6931471805599453    .6654887  3.834602         0          .     -8.97  0
    0 4 2010  .  1 15 1.0986122886681098    .5810004  4.150646         0          .   -14.747  0
    0 4 2011  .  1 15 1.3862943611198906    .5515633  3.913921  1.252763          .   -15.963  0
    0 4 2012  .  1 15 1.6094379124341003   .24565923 4.2385316 1.0986123        -.2     8.916  0
    0 5 2012  .  1 15 1.0986122886681098    .3649698   5.02932  5.088633          .     1.534  0
    0 5 2013  .  1 15 1.3862943611198906    .3247113  5.509064  3.979308   -.674377   -59.257  0
    0 5 2014  .  1 15 1.6094379124341003    .4130051   5.22494         0         -1  -168.915  0
    0 5 2015  .  1 15  1.791759469228055           .         .         .          .         .  .
    0 5 2016  .  1 15 1.9459101490553132           .         .         .          .         .  .
    0 5 2017  .  1 15 2.0794415416798357           .         .         .          .         .  .
    end
    label values Industry Industry
    label def Industry 1 "C", modify
    label def Industry 3 "F", modify
    label def Industry 8 "K", modify
    label def Industry 10 "M", modify
    label values GeographicalArea NUTS2
    label def NUTS2 1 "ITC1", modify
    label def NUTS2 15 "ITH4", modify
    label def NUTS2 19 "ITI3", modify