
  • Staggered Difference-in-Differences | Repost

    I am trying to estimate the effect of contract enforcement (the independent variable) on debt maturity (the dependent variable) using a staggered difference-in-differences specification. Contract enforcement was introduced in different states in different years, hence the need for a staggered diff-in-diff. The dataset (included as an attachment in .dta format, "Date Set.dta", and as a link to a screenshot) contains codes for the states, codes for the companies within each state, the variables related to the debt maturity question, and a year variable.

    As Follows:

    Screenshot:

    HTML Code:
    https://ibb.co/JBCV2wM
    Dataex command:

    Code:
    * Example generated by -dataex-. For more info, type help dataex clear input long co_code double(sa_short_term_bank_borrowings sa_long_term_bank_borrowings sa_bank_borrowings sa_total_ass 11 68.1 .6 68.7 507.9 1 1998 80 12 11 61 110.8 171.8 779.4 1.2 2001 80 12 11 80.8 110.3 191.1 829.9 .5 2002 80 12 11 54.8 186 240.8 698.5 .9 2005 80 12 11 72.5 186.4 258.9 742.9 1.1 2006 80 12 11 107.9 175.3 283.2 845.1 1.3 2007 80 12 11 170.6 147.4 318 978.4 1.6 2008 80 12 11 298.2 158.4 456.6 1210.4 2.3 2009 80 12 11 417.4 270.7 688.1 1583.7 3 2010 80 12 14 312.7 0 312.7 4229.4 1.8 1993 640 20 14 513.3 0 513.3 7161.9 1 1994 640 20 14 1639.3 0 1639.3 8807.1 .5 1995 640 20 14 1748.9 0 1748.9 10634.3 .1 1996 640 20 14 1891.4 0 1891.4 11180.7 .1 1997 640 20 66 0 227.7 227.7 623.6 1.2 2007 860 38 66 15.6 210.1 225.7 653.5 1.1 2008 860 38 66 0 236.5 236.5 676.3 .4 2010 860 38 87 0 54.9 54.9 126.7 .1 2000 130 33 87 .9 62.4 63.3 115.3 . 2001 130 33 87 0 40.9 40.9 100.8 .8 2003 130 33 87 .2 10.4 10.6 91.9 . 2005 130 33 87 1.2 .1 1.3 91.2 .4 2006 130 33 228 86.7 4.4 91.1 469 .2 1993 260 13 228 87.3 2.2 89.5 589.8 .1 1994 260 13 228 76.2 1.1 77.3 554.6 2.4 1995 260 13 289 10.3 0 10.3 241 5.6 1993 460 38 289 5 0 5 328.8 1.1 1994 460 38 289 8.5 7.1 15.6 520.9 2.5 1996 460 38 289 12.6 4.2 16.8 591.1 2.1 1997 460 38 289 150.8 8.3 159.1 699.5 18 2005 460 38 289 217.2 23.7 240.9 686.8 4.2 2006 460 38 328 0 .2 .2 28.8 . 2005 460 20 328 0 .3 .3 32.9 . 2006 460 20 328 0 2.8 2.8 35 . 
2007 460 20 328 0 4.8 4.8 49.3 1.1 2008 460 20 328 0 1.7 1.7 41.5 .4 2009 460 20 337 69.7 .7 70.4 950.6 3.2 1996 590 20 337 152.3 .6 152.9 1189 2.7 1997 590 20 337 135.2 .1 135.3 618.6 .1 1998 590 20 337 159.6 0 159.6 523.2 .4 1999 590 20 337 0 2.3 2.3 1126.6 .9 2004 590 20 337 0 1.4 1.4 1391.2 3.1 2005 590 20 337 0 .7 .7 1490.1 3.2 2006 590 20 337 0 3.2 3.2 1560.8 .7 2007 590 20 337 0 17.5 17.5 1654.4 1 2008 590 20 337 0 11.7 11.7 1723.3 7.7 2009 590 20 337 0 12.6 12.6 1469.2 2.6 2010 590 20 363 0 46.6 46.6 423.9 .4 1994 820 20 363 0 268 268 1022.8 1.6 1995 820 20 363 0 426.3 426.3 1538.9 7.6 1996 820 20 363 30 374.2 404.2 2403.7 9.9 1997 820 20 363 5 312.9 317.9 2485.7 15.7 1998 820 20 363 .3 219.2 219.5 2486.9 28.5 1999 820 20 363 0 188.1 188.1 2468.4 18.1 2000 820 20 363 0 115 115 2337.6 13.6 2001 820 20 363 4.5 277.8 282.3 2035.9 16.4 2002 820 20 363 7.7 437.1 444.8 1940 20.7 2003 820 20 363 21.3 605.1 626.4 1974.5 10.1 2004 820 20 363 2.3 472 474.3 1992.6 23.2 2005 820 20 363 0 817.8 817.8 2779.1 30.7 2006 820 20 363 0 1715.9 1715.9 4494 35.6 2007 820 20 363 301.7 1683.8 1985.5 6335.9 53.7 2008 820 20 363 .3 2126.5 2126.8 7898.6 72.6 2009 820 20 363 0 2175.3 2175.3 6647.1 61 2010 820 20 365 316 90.8 406.8 2097.7 .4 2001 300 12 365 589.6 36 625.6 4059.1 1.4 2002 300 12 365 300.4 18 318.4 4337.3 .7 2003 300 12 365 3217 250 3467 16359.9 .9 2007 300 12 365 4199.5 250 4449.5 21981 .8 2008 300 12 365 7682.4 6661.9 14344.3 42052.5 1.7 2009 300 12 365 14340.7 11098.3 25439 53942.3 .8 2010 300 12 372 38.4 .5 38.9 983.7 4.5 1999 550 10 372 0 138 138 860.4 9.1 2003 550 10 372 0 69.8 69.8 845.5 10.6 2004 550 10 372 0 .3 .3 832.1 12.4 2005 550 10 381 45.2 50 95.2 365.3 .1 2002 290 33 381 112.8 129.9 242.7 506 . 2003 290 33 381 88 125.6 213.6 571.2 .4 2004 290 33 381 202.3 192.1 394.4 775.2 1.3 2005 290 33 381 161.7 156.6 318.3 915.5 . 2006 290 33 381 66.6 183.1 249.7 1019.6 .3 2007 290 33 381 0 36.7 36.7 1201.8 . 2008 290 33 381 50.6 111.4 162 1284.9 . 
2009 290 33 400 2.7 0 2.7 6.7 . 1994 770 33 400 14 0 14 63 .5 1995 770 33 400 0 9.3 9.3 43.2 . 2002 770 33 402 72.5 0 72.5 127.4 . 1994 410 38 402 4.7 32.3 37 425 1.4 2002 410 38 402 4.8 30.7 35.5 536.4 .3 2003 410 38 402 6.4 41.3 47.7 408.4 1 2005 410 38 402 0 36.4 36.4 406.1 .8 2006 410 38 402 0 41.2 41.2 420 . 2007 410 38 402 0 39.9 39.9 453 1.2 2008 410 38 402 0 30.8 30.8 417.8 1.3 2009 410 38 402 0 45.7 45.7 468.6 .4 2010 410 38 414 .8 .2 1 151.6 .1 2001 460 10 414 0 16.5 16.5 149.9 . 2002 460 10 414 0 .6 .6 137.1 .1 2005 460 10 414 0 1.6 1.6 198.5 . 2007 460 10 414 0 1 1 203.9 .1 2008 460 10 end


    As previously mentioned, different groups of states have different treatment years (the treatment year for each state code is available in the attachment "Deregulation Years.xlsx"). Some states were treated (the law was passed) starting in 1995, others starting in 1997, 1999 or 2000. So, essentially, there are 4 treatment timelines across the states.

    As follows:

    Screenshot:

    https://ibb.co/M6qgvwD

    And actual sheet is an attachment.



    The task is to calculate the staggered diff-in-diff coefficient for treatment in general!

    I started by sorting the dataset using: (the command, results and codes are all in the attached pdf "Commands & Results.pdf")

    Code:
    sort st_code year
    I then created dummy variables for all states and all years so that I can account for state fixed effects and time fixed effects (as far as I am aware):

    Code:
    tabulate st_code , generate(statedummy)
    Code:
    tabulate year , generate(yeardummy)
    The most important step was to create the treated variable so that I could run the regression. I did this using the following code (the conditions correspond to the year of treatment for each state in the dataset):

    Code:
    generate treated = statedummy1==1 if year >= 1995
    Code:
    replace treated = 1 if ((st_code==3 & year>=1997) | (st_code==4 & year>=1997) | (st_code==5 & year>=1997) | (st_code==6 & year>=1995) | (st_code==8 & year>=1995) | (st_code==9 & year>=1995) | (st_code==10 & year>=1995) | (st_code==11 & year>=2000) | (st_code==12 & year>=1995) | (st_code==13 & year>=1995) | (st_code==14 & year>=1995) | (st_code==17 & year>=1995) | (st_code==18 & year>=1997) | (st_code==19 & year>=1999) | (st_code==20 & year>=2000) | (st_code==22 & year>=1997) | (st_code==26 & year>=1997) | (st_code==27 & year>=1997) | (st_code==29 & year>=1997) | (st_code==30 & year>=1995) | (st_code==31 & year>=1995) | (st_code==33 & year>=1997) | (st_code==36 & year>=1999) | (st_code==37 & year>=1999) | (st_code==38 & year>=1995))
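    The same indicator could be built more compactly from a per-state treatment year. This is only a sketch under assumptions: first_treat and treated_alt are hypothetical variable names, the state lists are copied from the conditions above, and the state behind statedummy1 is not identified in this post, so it would still have to be added by hand.

    ```stata
    * first_treat (hypothetical helper): year the law took effect in each state
    generate int first_treat = .
    replace first_treat = 1995 if inlist(st_code, 6, 8, 9, 10, 12, 13, 14, 17, 30, 31, 38)
    replace first_treat = 1997 if inlist(st_code, 3, 4, 5, 18, 22, 26, 27, 29, 33)
    replace first_treat = 1999 if inlist(st_code, 19, 36, 37)
    replace first_treat = 2000 if inlist(st_code, 11, 20)

    * 1 from the treatment year onward, 0 before it and for states with no
    * treatment year (a missing first_treat compares as larger than any year)
    generate byte treated_alt = (year >= first_treat) if !missing(year)
    ```

    Unlike the generate command above, which leaves the variable missing for years before 1995, this version codes pre-treatment years as 0.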
    I then created the dependent variable of interest to the research question as follows:

    Code:
    generate ltermonassets = sa_long_term_bank_borrowings/ sa_total_assets
    And the most important step was to carry out the regression using:

    Code:
    regress ltermonassets treated i.st_code i.year
    This gave me a coefficient for treatment. However, I suspect I made a mistake along the way, in that the dummies I regressed on aren't necessarily the ones I should have included.

    I also tried doing it differently with the xtreg command, after first declaring the panel with:

    Code:
     xtset st_code year
    but I got an error:
    repeated time values within panel
    So, my question is: did I carry out the staggered diff-in-diff correctly (i.e. does the coefficient on treated actually represent the required one), and if not, what did I miss (fail to incorporate) or do wrong?

    Thanks a lot in advance!

  • #2
    Somehow your -dataex- output got mangled in posting. In addition to just being one long line, the variable list apparently got cut short somehow. So please re-do the -dataex- and post it. I know you have also attached a Stata data set. While that will contain all the information -dataex- has, and perhaps more, attachments (of anything) are discouraged here because some members are unwilling to risk downloading things from people they don't know. So please post a corrected -dataex- to make this available to all interested Forum members. Similarly, to show results, pasting them here in the Forum between code delimiters is preferred over attaching a PDF. Thank you.

    As for the error message you got from -xtset-, it means exactly what it says. You have some values of st_code where there is more than one observation for some year(s). That is not allowed in panel data. It may be that your data is not, in the strict sense, panel data. But if it was your expectation that each st_code would have at most one observation in any given year, then your data set is not what you believe it to be and you need to fix that before doing anything else. The first step would be to find the surplus observations with
    Code:
    duplicates tag st_code year, gen(flag)
    browse if flag
    Then you need to review the data management that created this data set to see how they got there, and fix the mistakes, along with any other errors you might stumble on along the way. No point proceeding to analysis with the wrong data!
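    Before browsing, a quick count of how widespread the problem is may help. The following is a minimal sketch; the second command assumes, based on the variable names in the -dataex- output, that co_code identifies companies, which may or may not be the intended unit of observation.

    ```stata
    * How many st_code-year combinations occur more than once?
    duplicates report st_code year

    * Assumption: co_code identifies the company. If the data are
    * company-year observations, this pair is the one that should be unique.
    duplicates report co_code year
    ```

    If the first report shows many surplus observations and the second shows none, that would suggest the data are organized by company and year rather than by state and year.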

    Or you may realize that there is no reason to expect only a single observation per st_code in any year. If that is the case, then just change your -xtset- command to -xtset st_code-. That will still give you access to any -xt- commands that don't require time-series operators (lags, leads, etc.) or autoregressive correlation structure.
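    As a sketch of that second route, assuming the variables ltermonassets and treated have been created as in #1: declaring only the panel variable lets -xtreg, fe- absorb the state effects, with the year effects entered through factor-variable notation. The clustered standard errors are an added assumption, not something taken from the original commands.

    ```stata
    * Declare the panel variable only; no time variable is required here
    xtset st_code

    * State fixed effects are absorbed by -fe-; year effects enter via i.year
    * vce(cluster st_code) is an assumption about the appropriate standard errors
    xtreg ltermonassets treated i.year, fe vce(cluster st_code)
    ```

    The coefficient on treated from this command should match the one from -regress ltermonassets treated i.st_code i.year-, since both fit the same state and year effects; only the standard errors differ if clustering is used.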

    The way you generated your indicator ("dummy") variables is fine, but unnecessary. You can rely instead on Stata's factor variable notation to create (virtual) indicator variables on the fly in other commands. In fact, the -regress- command you show does exactly that. And apart from the one use of statedummy1 when creating the treatment variable, none of the code you show makes any use of the homebrew indicator variables you created. So you may as well skip those steps and not clutter up your data set with a bunch of unused variables.
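    To illustrate, here is a minimal sketch using the variable names from #1. The -drop- step assumes the generated indicators are no longer needed once the treatment variable exists.

    ```stata
    * Remove the hand-made indicators created by -tabulate, generate()-
    drop statedummy* yeardummy*

    * i.st_code and i.year create the state and year indicators on the fly,
    * omitting one base category of each automatically
    regress ltermonassets treated i.st_code i.year
    ```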
