  • How to construct treatment and control group based on the same industries in the DID test?

    Hi Stata experts,

    I am trying to run a difference-in-differences (DID) regression and am having difficulty constructing the treatment and control groups, which are based on deregulation events in specific industries and years. I need to define the control group as firms in the same sic3 industry as the treated firms but in a different sic4 industry, so that the two groups stay comparable. I then need to restrict the sample to an event window running from three years before to three years after each event. I'm not sure how to start. Can anyone please give me some hints?

    Here is my sample data. Many thanks to you in advance!



    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long gvkey float year double sic4 float sic3 double(depen indep)
    1004 2000 5080 508    .270463551727072 .11129193007946014
    1004 2001 5080 508  1.6483146147992638                  0
    1004 2014 5080 508   .2595504167163052                  0
    1004 2015 5080 508  1.4456641532722423                  0
    1004 2016 5080 508  1.4301815188307783                  0
    1004 2017 5080 508  .06651193238832491                  0
    1009 1992 3460 346   .7248196066826066  .6800046563148499
    1009 1993 3460 346   1.210178423204322  .6899944543838501
    1009 1994 3460 346    1.61961722871451  .6399991512298584
    1013 2003 3661 366  -.1571899460211795  .4449999928474426
    1013 2004 3661 366  1.2041092895984704  .4630001187324524
    1013 2005 3661 366  .42072738645833324  .4267875552177429
    1013 2006 3661 366  1.5649545574864725   .439971923828125
    1013 2007 3661 366   .5161695249965417  .4560580849647522
    1013 2008 3661 366   1.065011069272705 .42502060532569885
    1013 2009 3661 366  .34908213415980666  .3802548348903656
    1013 2010 3661 366  1.3168114726228735 .38561299443244934
    1050 2008 3564 356  1.1750447003647704                  0
    1055 1990 3571 357  1.3794294814974282                  0
    1055 1991 3571 357   .9866965676530831                  0
    1056 1998 3674 367  .22079930647189142 .15499617159366608
    1056 2001 3674 367   2.591821456484808 .10799886286258698
    1056 2002 3674 367   1.182396772815596 .10799767076969147
    1072 2014 3670 367  .14823916273620605                  0
    1072 2015 3670 367   .3649908994552586                  0
    1072 2016 3670 367   .1618882447941168                  0
    1072 2017 3670 367   .5851335550677529                  0
    1072 2018 3670 367  1.4344087654291133                  0
    1073 1990 7373 737 -.21919894674684137  .6103059649467468
    1073 1991 7373 737   3.925843251834408  .7399349808692932
    1073 1992 7373 737   .2969388273585185  .6599476933479309
    1073 1995 7373 737  -.7327295996416189  .5800233483314514
    1082 1990 1540 154  .08719758642423465  .4799616038799286
    1082 1991 1540 154  .45094126329230316    .66282719373703
    1082 1992 1540 154  .19061687114043133  .7242097854614258
    1082 1997 1540 154    .768209954920183  .5123208165168762
    1094 2003 5160 516   2.204409800268818                  0
    1094 2011 5160 516  1.0047125876487077                  0
    1094 2012 5160 516                   1                  0
    1094 2013 5160 516  1.2179454519601507                  0
    1094 2014 5160 516  1.7344836255744591                  0
    1094 2015 5160 516  2.5455304034143365 .13000068068504334
    1094 2016 5160 516  1.1369240732329473  .1399993598461151
    1094 2017 5160 516  1.8355225278682805 .22999978065490723
    1109 1995 2024 202  2.7016541861304337                  0
    1109 1996 3944 394  1.6226123098899485                  0
    1109 2005 3944 394   .6649959745803019   .360088586807251
    1109 2006 3944 394   .9022707230320697 .34005647897720337
    1109 2007 3944 394  1.2558767468069008 .23000000417232513
    1109 2008 3944 394  1.4215826455537088  .2499537467956543
    1111 1990 7372 737   2.336773633874905 .18999235332012177
    1111 1995 7372 737  2.0696081849126817 .14999139308929443
    1111 1996 7372 737 -.23614964569713925 .13999967277050018
    1111 2002 7372 737  1.0343471379900395 .28000059723854065
    1111 2003 7372 737  -.6103532620214409  .1600005030632019
    1111 2004 7372 737   .7619930754696743 .19999979436397552
    1111 2005 7372 737 -.41262026017417214 .22999991476535797
    1111 2006 7372 737   .7398461755717425  .3199999928474426
    1111 2007 7372 737   .7444859292807284  .2200002372264862
    1121 1990 5172 517   .8409829581564923  .4368971586227417
    1121 1991 5172 517   .0622641408346807  .4887270927429199
    1121 1992 5172 517  1.4684263005980962 .12298694998025894
    1121 1993 5172 517   .9065460964225657                  0
    1121 1997 5172 517    .779841873418654   .268926739692688
    1121 1998 5172 517  1.0852790696432069 .16158509254455566
    1121 1999 5172 517   .5866558217234114 .14140990376472473
    1121 2000 5172 517  1.0406173340868465 .14753295481204987
    1121 2003 5172 517   .8588846803166564  .1027916967868805
    1121 2006 5172 517  1.0604002874707852  .4178103506565094
    1121 2007 5172 517   .4343221734056306  .6908037066459656
    1121 2008 5172 517  .40532310946400674  .5660001635551453
    1121 2009 5172 517    .730009115072278  .7279999256134033
    1121 2010 5172 517  2.5462861280346156  .9790001511573792
    1121 2011 5172 517  1.1644757111517399  .5829998254776001
    1121 2012 5172 517  -.8207597382125889  .5490002036094666
    1121 2013 5172 517   .5095212386735103   .624000072479248
    1121 2014 5172 517  .14353837199173847  .3430001735687256
    1121 2015 5172 517  -.5211935252113744   .382000207901001
    1121 2016 5172 517   .8650104536530338  .6119995713233948
    1121 2017 5172 517 -.40928740785587747  .6139993667602539
    1121 2018 5172 517 -.08742956773406778 .41399991512298584
    1121 2019 5172 517  .38265263689097634 .48699983954429626
    1137 1990 3672 367  .29542707842886423  .1469968557357788
    1137 1991 3672 367  .43104426613493924                  0
    1137 1992 3672 367   .3503417809079537 .10199470818042755
    1137 1993 3672 367   .8479888812793633    .25900799036026
    1137 1994 3672 367  1.3304161968339807  .1749972105026245
    1151 1990 3842 384   .3906853897872559                  0
    1151 1991 3842 384 -.16102014271341772                  0
    1151 1992 3842 384   .7029400201251566                  0
    1151 1993 3842 384  .10613622428149376                  0
    1151 1994 3842 384  1.3688910879594527                  0
    1151 1995 3842 384    .378612814617232                  0
    1151 1996 3842 384 -.35069662906902377                  0
    1161 1990 3674 367  1.1149956557626215                  0
    1161 1996 3674 367  1.0203057340658366 .12999975681304932
    1161 1997 3674 367  1.1931414610690958 .11999957263469696
    1161 1998 3674 367   .8986887240821452 .12000003457069397
    1161 2004 3674 367   .9631844721623026                  0
    1161 2005 3674 367   1.201481751056755  .2659221291542053
    end
    ------------------ copy up to and including the previous line ------------------

    Listed 100 out of 39343 observations
    Use the count() option to list more

    .


    Here is my initial code:


    Code:
    gen deregulated = 0
    replace deregulated = 0 if inlist(sic3, 491, 492, 421, 485)
    
    reg depen c.indep##deregulated $control i.year, cluster(gvkey)
    Last edited by Jae Li; 23 May 2022, 12:17.

  • #2
    I don't understand the problem. What did Stata do that you didn't want it to do? What is it not doing that you want it to do? What even defines the treatment group here?



    • #3
      @Jared Greathouse Hi Jared, thank you for your reply! The treatment group is firms in deregulated industries, and the control group is other firms in the same sic3 industries but in different sic4 industries. Do you have any ideas about how to specify this in Stata? Many thanks!



      • #4
        Let me use a real example so I can give you a better idea about what I'm asking you.
        Code:
        qui {
        *import delim "https://raw.githubusercontent.com/SucreRouge/synth_control/master/basque.csv", clear
        u "http://econ.korea.ac.kr/~chirokhan/panelbook/data/basque-clean.dta", clear
        
        loc int_time = 1975
        
        g treated = cond(regionno==17 & year >= `int_time',1,0)
        
        labvars year gdpcap "Year" "ln(GDP per 100,000)"
        
        replace regionname = trim(regexr(regionname,"\(.+\) *",""))
        
        egen id = group(regionname), label(regionname) // makes a unique ID
        
        order id, b(year)
        
        *keep if year >= 1960
        drop if inlist(id,18) //12
        
        drop regionno
        xtset id year, y
        
        cls
        }
        This dataset is a panel of 17 Spanish regions, one of which was treated starting in 1975; the others were not. In this case, our treated unit is the Basque Country (Unit 17). So, one way to do this is
        Code:
        loc int_time = 1975
        
        g treated = cond(regionno==17 & year >= `int_time',1,0)
        where we make a treatment variable. Specifically, the cond() function follows the syntax cond(a,b,c): if a is true, it returns b; if a is false, it returns c. In this case, if the unit number is 17 AND the year is greater than or equal to 1975, we set the treatment variable to 1, because that's when terrorist attacks began in the Basque Country. For all other observations (those that are NOT "Unit 17 AND 1975 or later"), we set the treatment variable to 0, since 16 units were never treated and one unit was treated only from 1975 onward.
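
        As a quick check of how cond() behaves, here is a minimal, self-contained sketch on a four-row toy dataset (the rows are made up for illustration and are not taken from the Basque data). It also shows that, for a 0/1 indicator, a bare logical expression gives the same result.

        ```stata
        * Toy data: hypothetical rows, for illustration only
        clear
        input byte regionno int year
        17 1974
        17 1975
        17 1980
        10 1980
        end

        local int_time = 1975

        * cond(a, b, c): returns b where a is true, c where a is false
        generate byte treated = cond(regionno == 17 & year >= `int_time', 1, 0)

        * Equivalent shortcut: a logical expression already evaluates to 1 or 0
        generate byte treated2 = (regionno == 17 & year >= `int_time')

        * Both versions agree; treated is 0, 1, 1, 0 for the four rows above
        assert treated == treated2
        list, noobs
        ```

        One caveat: Stata treats missing values as larger than any number, so year >= 1975 is also true when year is missing. If that can happen in your data, add & !missing(year) to the condition.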

        I see that deregulation is your treatment, and that's fine, but how am I meant to know which specific industries were treated or not? You don't specify. I'm also having trouble following your code. When I do
        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input long gvkey float year double sic4 float sic3 double(depen indep)
        1004 2000 5080 508    .270463551727072 .11129193007946014
        1004 2001 5080 508  1.6483146147992638                  0
        1004 2014 5080 508   .2595504167163052                  0
        1004 2015 5080 508  1.4456641532722423                  0
        1004 2016 5080 508  1.4301815188307783                  0
        1004 2017 5080 508  .06651193238832491                  0
        1009 1992 3460 346   .7248196066826066  .6800046563148499
        1009 1993 3460 346   1.210178423204322  .6899944543838501
        1009 1994 3460 346    1.61961722871451  .6399991512298584
        1013 2003 3661 366  -.1571899460211795  .4449999928474426
        1013 2004 3661 366  1.2041092895984704  .4630001187324524
        1013 2005 3661 366  .42072738645833324  .4267875552177429
        1013 2006 3661 366  1.5649545574864725   .439971923828125
        1013 2007 3661 366   .5161695249965417  .4560580849647522
        1013 2008 3661 366   1.065011069272705 .42502060532569885
        1013 2009 3661 366  .34908213415980666  .3802548348903656
        1013 2010 3661 366  1.3168114726228735 .38561299443244934
        1050 2008 3564 356  1.1750447003647704                  0
        1055 1990 3571 357  1.3794294814974282                  0
        1055 1991 3571 357   .9866965676530831                  0
        1056 1998 3674 367  .22079930647189142 .15499617159366608
        1056 2001 3674 367   2.591821456484808 .10799886286258698
        1056 2002 3674 367   1.182396772815596 .10799767076969147
        1072 2014 3670 367  .14823916273620605                  0
        1072 2015 3670 367   .3649908994552586                  0
        1072 2016 3670 367   .1618882447941168                  0
        1072 2017 3670 367   .5851335550677529                  0
        1072 2018 3670 367  1.4344087654291133                  0
        1073 1990 7373 737 -.21919894674684137  .6103059649467468
        1073 1991 7373 737   3.925843251834408  .7399349808692932
        1073 1992 7373 737   .2969388273585185  .6599476933479309
        1073 1995 7373 737  -.7327295996416189  .5800233483314514
        1082 1990 1540 154  .08719758642423465  .4799616038799286
        1082 1991 1540 154  .45094126329230316    .66282719373703
        1082 1992 1540 154  .19061687114043133  .7242097854614258
        1082 1997 1540 154    .768209954920183  .5123208165168762
        1094 2003 5160 516   2.204409800268818                  0
        1094 2011 5160 516  1.0047125876487077                  0
        1094 2012 5160 516                   1                  0
        1094 2013 5160 516  1.2179454519601507                  0
        1094 2014 5160 516  1.7344836255744591                  0
        1094 2015 5160 516  2.5455304034143365 .13000068068504334
        1094 2016 5160 516  1.1369240732329473  .1399993598461151
        1094 2017 5160 516  1.8355225278682805 .22999978065490723
        1109 1995 2024 202  2.7016541861304337                  0
        1109 1996 3944 394  1.6226123098899485                  0
        1109 2005 3944 394   .6649959745803019   .360088586807251
        1109 2006 3944 394   .9022707230320697 .34005647897720337
        1109 2007 3944 394  1.2558767468069008 .23000000417232513
        1109 2008 3944 394  1.4215826455537088  .2499537467956543
        1111 1990 7372 737   2.336773633874905 .18999235332012177
        1111 1995 7372 737  2.0696081849126817 .14999139308929443
        1111 1996 7372 737 -.23614964569713925 .13999967277050018
        1111 2002 7372 737  1.0343471379900395 .28000059723854065
        1111 2003 7372 737  -.6103532620214409  .1600005030632019
        1111 2004 7372 737   .7619930754696743 .19999979436397552
        1111 2005 7372 737 -.41262026017417214 .22999991476535797
        1111 2006 7372 737   .7398461755717425  .3199999928474426
        1111 2007 7372 737   .7444859292807284  .2200002372264862
        1121 1990 5172 517   .8409829581564923  .4368971586227417
        1121 1991 5172 517   .0622641408346807  .4887270927429199
        1121 1992 5172 517  1.4684263005980962 .12298694998025894
        1121 1993 5172 517   .9065460964225657                  0
        1121 1997 5172 517    .779841873418654   .268926739692688
        1121 1998 5172 517  1.0852790696432069 .16158509254455566
        1121 1999 5172 517   .5866558217234114 .14140990376472473
        1121 2000 5172 517  1.0406173340868465 .14753295481204987
        1121 2003 5172 517   .8588846803166564  .1027916967868805
        1121 2006 5172 517  1.0604002874707852  .4178103506565094
        1121 2007 5172 517   .4343221734056306  .6908037066459656
        1121 2008 5172 517  .40532310946400674  .5660001635551453
        1121 2009 5172 517    .730009115072278  .7279999256134033
        1121 2010 5172 517  2.5462861280346156  .9790001511573792
        1121 2011 5172 517  1.1644757111517399  .5829998254776001
        1121 2012 5172 517  -.8207597382125889  .5490002036094666
        1121 2013 5172 517   .5095212386735103   .624000072479248
        1121 2014 5172 517  .14353837199173847  .3430001735687256
        1121 2015 5172 517  -.5211935252113744   .382000207901001
        1121 2016 5172 517   .8650104536530338  .6119995713233948
        1121 2017 5172 517 -.40928740785587747  .6139993667602539
        1121 2018 5172 517 -.08742956773406778 .41399991512298584
        1121 2019 5172 517  .38265263689097634 .48699983954429626
        1137 1990 3672 367  .29542707842886423  .1469968557357788
        1137 1991 3672 367  .43104426613493924                  0
        1137 1992 3672 367   .3503417809079537 .10199470818042755
        1137 1993 3672 367   .8479888812793633    .25900799036026
        1137 1994 3672 367  1.3304161968339807  .1749972105026245
        1151 1990 3842 384   .3906853897872559                  0
        1151 1991 3842 384 -.16102014271341772                  0
        1151 1992 3842 384   .7029400201251566                  0
        1151 1993 3842 384  .10613622428149376                  0
        1151 1994 3842 384  1.3688910879594527                  0
        1151 1995 3842 384    .378612814617232                  0
        1151 1996 3842 384 -.35069662906902377                  0
        1161 1990 3674 367  1.1149956557626215                  0
        1161 1996 3674 367  1.0203057340658366 .12999975681304932
        1161 1997 3674 367  1.1931414610690958 .11999957263469696
        1161 1998 3674 367   .8986887240821452 .12000003457069397
        1161 2004 3674 367   .9631844721623026                  0
        1161 2005 3674 367   1.201481751056755  .2659221291542053
        end
        
        cls
        
        g deregulated = 0
        replace deregulated = 0 if inlist(sic3, 491, 492, 421, 485)
        
        tab deregulated
        I get
        Code:
        deregulated |      Freq.     Percent        Cum.
        ------------+-----------------------------------
                  0 |        100      100.00      100.00
        ------------+-----------------------------------
              Total |        100      100.00
        but presumably this is not what you want. So what I need you to tell me is which specific industries were treated and when they were treated. In fact, let's keep it simple: tell me the name of one treated industry in this example dataset and what you'd like its control group to be. Then we can generalize the code to include other treated units. But in order for me to do this, you must tell me the specific units that were treated, that is, their gvkey numbers and the year they received treatment. I can follow much of what you say, but I want to give the most precise advice possible.


        EDIT: An example with my data defines control units as northern, bordering units to the Basque Country
        Code:
        cd "$MP/Figures"
        qui {
        *import delim "https://raw.githubusercontent.com/SucreRouge/synth_control/master/basque.csv", clear
        
        loc int_time = 1975
        
        
        sysuse basque, clear
        
        replace regionname = "Pais Vasco" if regionname == "Basque Country (Pais Vasco)"
        labvars year gdpcap "Year" "ln(GDP per 100,000)"
        
        replace regionname = trim(regexr(regionname,"\(.+\) *",""))
        
        egen id = group(regionname), label(regionname) // makes a unique ID
        
        order id, b(year)
        
        *keep if year >= 1960
        drop if inlist(id,18) //12
        
        drop regionno
        xtset id year, y
        
        g treat = cond(id==15 & year >=1975,1,0)
        cls
        }
        
        
        g north = cond(inlist(id,5,11,15,16),1,0)
        
        g border = cond(id==5,1,0)
        
        g donor = cond(north == 1 & border ==1,1,0)
        
        keep if donor ==1 | id ==15
        
        br
        Last edited by Jared Greathouse; 24 May 2022, 11:12.



        • #5
          @Jared Greathouse Hi Jared, thank you so much for providing me with this great example! My apologies for the deregulated variable: -dataex- only shows the first 100 observations, and the industries that were deregulated are not among them. To make it work with the sample data, suppose the treated industries are sic3 = 517, 737, 366, and 367, and the years they received treatment are year = 1992 and 1996, respectively, because there were two government acts. However, the control group should consist of firms in the same sic3 industry as the treated firms but in a different sic4 industry; e.g., 3674 and 3672 are in the same sic3 industry but are different sic4 industries.

          The event window also needs to run from three years before to three years after each event. I'm very much looking forward to your input and advice on the coding. I would very much appreciate it!
          If you have any questions, please do not hesitate to let me know. Many thanks to you!
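
          To make the request concrete, here is a rough sketch of the sample construction described above, run on a handful of firm-years copied from the -dataex- example in #1. It rests on assumptions that are not settled in this thread: the event table is hypothetical, in particular which sic4 code inside each sic3 industry counts as deregulated (treated_sic4) and which sic3 industry goes with which event year. The names treated_sic4, event_year, rel_year, and post are made up for illustration.

          ```stata
          * Hypothetical event table: one row per deregulated sic3 industry.
          * The treated sic4 codes and the sic3-to-year mapping are placeholders.
          clear
          input float sic3 double treated_sic4 int event_year
          367 3674 1996
          737 7372 1992
          end
          tempfile events
          save `events'

          * A few firm-years taken from the -dataex- example in #1
          clear
          input long gvkey float year double sic4 float sic3
          1056 1998 3674 367
          1056 2001 3674 367
          1137 1993 3672 367
          1137 1994 3672 367
          1161 1990 3674 367
          1161 1996 3674 367
          1161 1997 3674 367
          1073 1990 7373 737
          1073 1995 7373 737
          1111 1990 7372 737
          1111 1995 7372 737
          1004 2000 5080 508
          end

          * Keep only firms whose sic3 industry appears in the event table
          merge m:1 sic3 using `events', keep(match) nogenerate

          * Treated: same sic4 as the deregulated one; control: same sic3, other sic4
          generate byte treated = (sic4 == treated_sic4)

          * Event time, restricted to three years before through three years after
          generate int rel_year = year - event_year
          keep if inrange(rel_year, -3, 3)
          generate byte post = (rel_year >= 0)

          sort gvkey year
          list gvkey year sic4 sic3 treated rel_year post, sepby(gvkey) noobs
          ```

          With the real data, the event table would have one row per deregulated sic3 industry, and the rest of the code should carry over unchanged as long as sic3 uniquely identifies a row in that table.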
          Last edited by Jae Li; 26 May 2022, 12:14.



          • #6
            Hi Statalist, is there anyone who possibly knows how to do it?



            • #7
              So wait, were there two treatments or were the units just treated at different times?



              • #8
                @Jared Greathouse Thank you for your reply! The latter is right! They are treated at different times. Looking forward to hearing from you!
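
                Since the units are treated at different times, one common starting point is a two-way fixed-effects regression in which the DID term switches on at each firm's own event year. The sketch below runs on simulated data only: the cohort assignment (sic3 367 in 1996, sic3 737 in 1992), the treated flag, and the 0.5 effect size are all placeholders. It is a generic DID specification, not necessarily the model from #1.

                ```stata
                * Simulated panel: 40 firms, 7 years each (event window -3 to +3)
                clear
                set seed 12345
                set obs 40
                generate long gvkey = 1000 + _n

                * Two hypothetical cohorts with different event years
                generate float sic3 = cond(_n <= 20, 367, 737)
                generate int event_year = cond(sic3 == 367, 1996, 1992)

                * Placeholder treated flag: odd-numbered firms are "deregulated"
                generate byte treated = mod(_n, 2)

                * One row per firm-year in the event window
                expand 7
                bysort gvkey: generate int rel_year = _n - 4    // -3, ..., 3
                generate int year = event_year + rel_year

                * post is defined relative to each firm's own event year
                generate byte post = (rel_year >= 0)
                generate byte did = treated * post

                * Simulated outcome with a built-in effect of 0.5
                generate double depen = rnormal() + 0.5 * did

                * Firm and year fixed effects, standard errors clustered by firm
                xtset gvkey year
                xtreg depen did i.year, fe vce(cluster gvkey)
                ```

                The firm fixed effects absorb treated, and in this layout post is a linear combination of the firm and year effects, so only did is entered alongside the year dummies.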
