  • Using difference-in-differences for a propensity score matched sample with caliper matching

    Dear Statalists!

    I want to run a difference-in-differences analysis in Stata to estimate the impact of introducing a board gender policy on targeted firms. Targeted firms are those included in a particular index (captured by the dummy variable inclindex); my main analyses use 2018 as the pre-treatment period and 2020 as the post-treatment period.
    I have pooled cross-sectional data containing board gender information on firms from 2008 to 2020.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double CompanyID float year double genderratio float(inclindex inclindex2020 av_genratio_bysic) byte sic_first_two_digits_numeric
       1383 2008 .8890000000000001 1 0   .86325  1
       6727 2008              .875 1 0   .86325  1
      21032 2008 .8000000000000002 1 0   .86325  1
      30201 2008 .8890000000000001 1 0   .86325  1
       5507 2008 .8570000000000001 1 0     .857  2
     855753 2008                 1 0 0        1  7
      32407 2008                 1 1 0        1  7
      29534 2008                 1 0 0        1  7
     928025 2008                 1 0 0 .9632609 10
      14555 2008                 1 1 0 .9632609 10
     896926 2008                 1 0 0 .9632609 10
    1040083 2008                 1 0 0 .9632609 10
       7223 2008                 1 1 0 .9632609 10
      19019 2008                .5 0 0 .9632609 10
     641563 2008                 1 0 0 .9632609 10
      22069 2008 .8330000000000001 1 0 .9632609 10
     930452 2008                 1 0 0 .9632609 10
      13516 2008                 1 0 0 .9632609 10
     912623 2008                 1 0 0 .9632609 10
      24744 2008                 1 0 0 .9632609 10
    1027165 2008                 1 0 0 .9632609 10
      29267 2008 .8890000000000001 1 0 .9632609 10
    1235823 2008                 1 0 0 .9632609 10
      28009 2008                 1 0 0 .9632609 10
    1019871 2008                 1 0 0 .9632609 10
      12493 2008 .9329999999999999 1 0 .9632609 10
       3324 2008                 1 0 0 .9632609 10
     917665 2008                 1 1 0 .9632609 10
      29256 2008                 1 1 0 .9632609 10
      20764 2008                 1 0 0 .9632609 10
      32799 2008                 1 0 0 .9632609 10
     604915 2008                 1 1 0 .9476666 12
      21455 2008                 1 1 0 .9476666 12
     754792 2008                 1 1 0 .9476666 12
    1005041 2008                 1 1 0 .9476666 12
     598387 2008 .8890000000000001 1 0 .9476666 12
     914699 2008                 1 0 0 .9476666 12
       1467 2008 .8330000000000001 0 0 .9476666 12
      33380 2008                 1 1 0 .9476666 12
      33052 2008                 1 1 0 .9476666 12
      19772 2008 .9000000000000001 1 0 .9476666 12
     746499 2008               .75 1 0 .9476666 12
     566406 2008                 1 1 0 .9476666 12
      86176 2008                 1 1 0 .9623077 13
      22049 2008 .8329999999999997 1 0 .9623077 13
       2138 2008 .8000000000000002 1 0 .9623077 13
      10667 2008                 1 1 0 .9623077 13
     637778 2008                 1 1 0 .9623077 13
      24502 2008                 1 1 0 .9623077 13
      12696 2008                 1 1 0 .9623077 13
     665249 2008                 1 0 0 .9623077 13
      13388 2008 .9000000000000001 1 0 .9623077 13
      13132 2008                 1 1 0 .9623077 13
      20318 2008                 1 1 0 .9623077 13
       9002 2008                 1 1 0 .9623077 13
      10101 2008                 1 0 0 .9623077 13
      23730 2008                 1 1 0 .9623077 13
     467702 2008                 1 1 0 .9623077 13
      29605 2008                 1 1 0 .9623077 13
      24355 2008              .769 1 0 .9623077 13
    1042184 2008                 1 0 0 .9623077 13
      20703 2008                 1 0 0 .9623077 13
      22312 2008                 1 1 0 .9623077 13
     482852 2008 .9000000000000001 0 0 .9623077 13
    1025180 2008                 1 0 0 .9623077 13
       9634 2008                 1 1 0 .9623077 13
      11769 2008 .8330000000000001 0 0 .9623077 13
      19977 2008 .8570000000000001 1 0 .9623077 13
      82989 2008                 1 1 0 .9623077 13
      26667 2008 .8890000000000001 1 0 .9623077 13
    1067306 2008 .8570000000000001 1 0 .9623077 13
      17077 2008               .75 0 0 .9623077 13
      22899 2008 .9170000000000004 1 0 .9623077 13
       2307 2008 .9169999999999999 1 0 .9623077 13
     827641 2008                 1 1 0 .9623077 13
     925876 2008 .8570000000000001 0 0 .9623077 13
       5506 2008                 1 1 0 .9623077 13
     644340 2008                 1 0 0 .9623077 13
      25756 2008                 1 1 0 .9623077 13
      23716 2008 .8000000000000002 0 0 .9623077 13
       2993 2008 .8330000000000001 1 0 .9623077 13
      11364 2008                 1 1 0 .9623077 13
     938013 2008 .8570000000000001 0 0 .9623077 13
        531 2008                 1 1 0 .9623077 13
     917978 2008                 1 0 0 .9623077 13
      14212 2008 .9000000000000001 1 0 .9623077 13
     141137 2008                 1 1 0 .9623077 13
       3642 2008                 1 1 0 .9623077 13
       2520 2008                 1 1 0 .9623077 13
     780575 2008                 1 1 0 .9623077 13
       9103 2008 .8890000000000001 1 0 .9623077 13
    1006743 2008                 1 0 0 .9623077 13
      32067 2008 .9169999999999999 1 0 .9623077 13
      24266 2008 .8890000000000001 1 0 .9623077 13
       6592 2008              .909 1 0 .9623077 13
     630809 2008                 1 0 0 .9623077 13
      23784 2008                 1 1 0 .9623077 13
       1857 2008                 1 0 0 .9623077 13
    1025988 2008                 1 0 0 .9623077 13
      10762 2008                 1 0 0 .9623077 13
    end

    My approach:
    (1) Use propensity score matching to form a matched control sample via the psmatch2 command, with the average genderratio by industry as a covariate:

    Code:
     psmatch2 inclindex av_genratio_bysic, out(genderratio) radius caliper(0.01)
    I then use pstest to assess the quality of the match:

    Code:
     pstest av_genratio_bysic
    The psmatch2 command generates the following new variables: _pscore _treated _support _weight _genderratio
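
    A minimal sketch of how these variables might be inspected before moving on, assuming that observations not used in matching are left with a missing _weight. The variable name matched is purely illustrative and not part of the psmatch2 output:

    Code:
    * Summarize the estimated propensity score and the matching weights
    summarize _pscore _weight
    * Cross-tabulate treatment status against the common-support indicator
    tabulate _treated _support
    * Hypothetical flag: 1 if the observation received a matching weight
    generate byte matched = !missing(_weight)
    tabulate inclindex matched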

    (2) Test for parallel trends with didregress
    Next, I would like to test for parallel trends prior to 2018 using the matched sample from (1). However, I am unsure how to use the new variables to form the matched treatment and control groups.

    For a non-matched control sample, I would proceed as follows:

    Code:
     didregress (genderratio c.sic_first_two_digits_numeric c.numberdirectors) (inclindex2020), group (inclindex) time (year)
    Visual inspection of parallel trends using
    Code:
     estat trendplots
    Testing of parallel trends using
    Code:
     estat ptrends
    But how should I adjust the code above to use the propensity-score-matched sample?

    For regress, I found this page, https://www.ssc.wisc.edu/sscc/pubs/stata_psmatch.htm, which suggests using frequency weights for nearest-neighbour matched samples, as in the code below:

    Code:
    psmatch2 t x1 x2, out(y) logit
    reg y x1 x2 t [fweight=_weight]
    Simply applying this approach to didregress, as in
    Code:
     didregress (genderratio c.sic_first_two_digits_numeric c.numberdirectors) (inclindex2020), group (inclindex) time (year) [fweight=_weight]
    does not work. Stata returns the error "option [fweight=_weight] not allowed" r(198). I suspect the cause is that fweight requires integer values, which the _weight variable from caliper matching does not have.

    After checking help weight, help psmatch2 and help didregress, I still could not work out how to use the results from caliper PSM in diff-in-diff analyses.


    Do you have any advice on how to derive the treatment and control group after using psmatch2 with caliper matching and how I can include the matched control group in my didregress code?


    Thanks in advance - your help is highly appreciated!

    (I am using Stata 18)

  • #2
    try aweight

    • #3
      Thanks a lot! Using aweight and slightly adjusting the syntax worked for me!

      For those interested, I used the following code
      Code:
      didregress (genderratio c.sic_first_two_digits_numeric c.numberdirectors) (inclindex2020) [aweight=_weight], group (inclindex) time (year)
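
      A sketch of how the parallel-trends checks from #1 would presumably follow on the matched sample, assuming the estat postestimation commands accept the weighted fit in the same way as the unweighted one. Note that the weight expression sits before the comma that opens the options:

      Code:
      * Weighted DiD fit on the matched sample (same specification as above)
      didregress (genderratio c.sic_first_two_digits_numeric c.numberdirectors) (inclindex2020) [aweight=_weight], group(inclindex) time(year)
      * Visual check of pre-treatment trends
      estat trendplots
      * Formal test of parallel pre-treatment trends
      estat ptrends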
      I also used dtable to compare the matched control sample with the treatment sample on a set of (arbitrarily chosen) characteristics:
      Code:
      dtable noquals genderratio networksize TimeBrd TimeInCo [aweight = _weight], by(inclindex)
      The resulting table reports the same number of observations N for the treatment and control groups. Before matching, the control group has 11,043 observations and the treatment group 25,788. After caliper matching as described in #1, both groups have 25,788 observations.
      This surprises me: I would have expected such a result for one-to-one matching, but not for caliper matching. I am not sure whether this is how it is supposed to be, and I feel I am missing the intuition behind the result. As a sensitivity check I varied the caliper size, but the number of observations per group stayed the same.

      Is it supposed to be like this? If yes, what is the underlying rationale?
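
      One way to probe this, sketched under the assumption that _weight is missing for observations not used in matching: compare the raw number of matched rows in each group with the sum of their weights. If radius matching spreads each treated observation's unit weight across the controls inside its caliper, the control weights would sum to the number of matched treated observations, which could explain equal weighted counts even though fewer distinct controls are used.

      Code:
      * Raw number of rows that carry a matching weight, by group
      count if inclindex == 0 & !missing(_weight)
      count if inclindex == 1 & !missing(_weight)
      * Number of weighted rows and total weight, by group
      tabstat _weight, by(inclindex) statistics(n sum)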


      Thanks for shedding some light on this matter!
