Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Applying difference in difference model for panel data with different treatment moments

    Dear Statalist users,

    I would like to use a Difference in Difference model. My data involves panel data, firm-level analysis over the period 2012-2017.
    I am relatively new to Stata and therefore I would like to ask for some help with how to do a regression using the dif-in-dif model.

    I would like to measure the effect of the use of assurance services on financial performance. Financial performance in this case is measured as return on assets ("roa").
    Control variables are firm size (measures by total assets) and industry (measures by GIC industries "gind").

    My treatment is the use of third party assurance services. I made a dummy variable which is 1 for a firm that uses assurance services and 0 for a firm that doesn't.
    I also made two variables 'firmid' and 'timeid'.

    The tricky thing about my data is that the treatment moments differ for each firm. E.g. some firms might not be treated until 2015 and then are treated after 2015. Another firm might be treated in the years 2012 and 2013, but treatment stopped in the following years. And then there are firms that are not treated in for instance 2012, 2013 and 2014, then are treated only in 2015, and not treated again in 2016 and 2017.

    The research design that I have come up with is as follows:
    𝑌𝑖𝑡=𝛼0+𝛼1𝐴𝑠𝑠𝑢𝑟𝑎𝑛𝑐𝑒𝑖𝑡+𝛽𝑋𝑖𝑡+𝑢𝑖+𝑑𝑡+𝜀𝑖𝑡
    where
    Y=roa
    The coefficient on Assurance, 𝛼1, is the variation in the dependent variable Y caused by differences between the treated and the control group in the period t when the assurance is used.
    X= the vector of controls
    𝑢𝑖= the fixed variation effect caused by external control industry
    𝑑𝑡=the fixed variation effect caused by year specific changes

    I hope I am clear enough in my explanation of what I need. See example of data below.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str34 nameyear str3 ExternalAssurance float(roa ExternalAssurancedum firmsize) str6 gind
    "REXAM2012"                     "No"     .033631936 0    6363 "151030"
    "REXAM2015"                     "Yes"     .03750258 1    4853 "151030"
    "BP2012"                        "Yes"     .03857518 1  300193 "101020"
    "BP2013"                        "Yes"     .07670843 1  305690 "101020"
    "BP2014"                        "Yes"    .013288545 1  284305 "101020"
    "BP2015"                        "Yes"    -.02476397 1  261832 "101020"
    "BP2016"                        "Yes"   .0004329399 1  263316 "101020"
    "BP2017"                        "Yes"      .0122525 1  276515 "101020"
    "GLAXOSMITHKLINE2012"           "Yes"      .1100663 1   41475 "352020"
    "GLAXOSMITHKLINE2013"           "Yes"      .1291641 1   42086 "352020"
    "GLAXOSMITHKLINE2014"           "Yes"     .06779661 1   40651 "352020"
    "GLAXOSMITHKLINE2015"           "No"       .1575796 0   53446 "352020"
    "GLAXOSMITHKLINE2016"           "No"     .015436435 0   59081 "352020"
    "GLAXOSMITHKLINE2017"           "No"     .027172275 0   56381 "352020"
    "UNILEVER2012"                  "No"      .09704112 0   46166 "303020"
    "UNILEVER2013"                  "No"      .10638718 0   45513 "303020"
    "UNILEVER2014"                  "No"       .1076686 0   48027 "303020"
    "UNILEVER2015"                  "Yes"     .09386592 1   52298 "303020"
    "UNILEVER2016"                  "Yes"     .09186766 1   56429 "303020"
    "UNILEVER2017"                  "Yes"      .1004064 1   60285 "303020"
    "BARCLAYS2012"                  "Yes"  -.0006985072 1 1490321 "401010"
    "BARCLAYS2013"                  "Yes"   .0004115016 1 1312267 "401010"
    "BARCLAYS2014"                  "Yes" -.00008837136 1 1357906 "401010"
    "BARCLAYS2015"                  "Yes" -.00028928262 1 1120012 "401010"
    "BARCLAYS2016"                  "No"    .0012875827 0 1213126 "401010"
    "BARCLAYS2017"                  "Yes"   .0005179802 1 1133248 "401010"
    "THALES2012"                    "No"     .025112037 0 21332.4 "201010"
    "THALES2013"                    "No"      .02667622 0 21494.8 "201010"
    "THALES2014"                    "Yes"    .035726614 1 19990.7 "201010"
    "THALES2015"                    "No"     .035486683 0 21560.2 "201010"
    "THALES2016"                    "No"      .04171074 0 22689.6 "201010"
    "THALES2017"                    "No"     .035217576 0 23332.1 "201010"
    "VODAFONE GRP2013"              "No"      .09148063 0  121840 "501020"
    "VODAFONE GRP2014"              "Yes"     .04653553 1  122573 "501020"
    "VODAFONE GRP2015"              "Yes"   -.030094307 1  133713 "501020"
    "VODAFONE GRP2016"              "Yes"   -.014157896 1  154684 "501020"
    end
    kind regards, Fleur

  • #2
    So, you will need to do some surgery on that nameyear variable, because you need those to be separate variables in the model. Also, you need a numeric variable to correspond to the name part of nameyear. In the example data you show, the variable gind seems to serve that purpose, but in the code below I have calculated a separate variable because the name gind doesn't suggest anything to me and I'm not confident it really does correspond exactly to names.

    Code:
    //    CREATE SEPARATE VARIABLES FOR NAME AND YEAR
    gen year = real(substr(nameyear, -4, 4))
    gen name = substr(nameyear, 1, length(nameyear)-4)
    
    //    NOW CREATE A NUMERIC VARIABLE FOR NAME SO IT CAN
    //    SERVE AS AN EFFECT IN THE REGRESSION
    encode name, gen(firm)
    
    xtset firm year
    
    xtreg roa i.ExternalAssurancedum /*covariates go here*/ i.year, fe
    I have used fixed-effects regression because that seems to be more commonly used in finance and economics than random effects (or some other approach to panel data regression such as -xtgls-), but you will have to make that decision for yourself.

    My treatment is the use of third party assurance services. I made a dummy variable which is 1 for a firm that uses assurance services and 0 for a firm that doesn't.
    I also made two variables 'firmid' and 'timeid'.
    Neither of these is true of the data you show. I suppose firmid and timeid are substantively the same as the variables name and year that I create in the code above. If so, feel free to use yours instead of mine. Your variable ExternalAssurancedum is not as you describe. Your variable ExternalAssurancedum takes on the value 1 for a firm that uses external assurances, but only in those years where the external assurances are in use; it is 0 otherwise. In fact, this is precisely what you need here. So use it the way it is, but when you write up your study be sure to describe it correctly.

    Comment


    • #3
      Dear Clyde, thank you for the explanation. The tip for separating the nameyear variable is very much appreciated.
      I do have some follow up questions:

      First of all, after doing xtset I got the following:

      Code:
      xtset firm year
      panel variable: firm (unbalanced)
      time variable: year, 2012 to 2017, but with gaps
      delta: 1 unit

      Does the fact that my data is unbalanced cause a problem for the regression? Treatment thus takes place in different years and there is not 1 treatment period, plus I do not have data of all firms in all the same years (see data sample below). I am now thus also doubting whether DID is even really possible for this type of panel data?

      Furthermore, I have done the xtreg command, but unfortunately this resulted in no significance. Therefore, I think I might have to approach the regression in a different way in order to check for significance to be able to conclude whether the use of assurance services has an effect on return on assets or not.

      FYI, the variable Sector_n stands for the industry in which the firm operates and the variable firmsize is measured by total assets.

      Code:
      xtreg roa i.ExternalAssurancedum Sector_n firmsize i.year, fe
      where roa is the dependent variable, ExternalAssurancedum is the independent dummy variable, Sector_n and firmsize are control variables and year is another dependent variable.

      Could the fact that I only find significance in 2017 be due to the fact that my data is unbalanced? Could it furthermore be due to the fact that some firms have no external assurance in all the years for which I have data on that particular firm and some firms do have external assurance in all the years, causing the before and after treatment not being visible for these two type of firms?

      I have found an example of someone who created multiple dummies for an unbalanced panel data set, which I think I might have to apply as well on https://stats.stackexchange.com/ques...google_rich_qa
      However, as far as I am concerned my problem is that I cannot observe for the non-treated firms (no assurance in any of the years) if they have even considered treatment and if so in what year (I cannot observe what is the post-treatment period for firms in the control group), which makes it hard to make the 'post' dummy, right?

      This time I will add a dataex sample including all the variables that I have mentioned so far.
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str30 name float year str3 ExternalAssurance str17 LevelofAssurance float(roa ExternalAssurancedum firmsize) long(firm Sector_n)
      "3I INFRASTRUCTURE"          2012 "No"  ""                     .05614281 0    1428.5  1 14
      "3I INFRASTRUCTURE"          2013 "No"  ""                    .063511945 0    1117.9  1 14
      "3I INFRASTRUCTURE"          2014 "No"  ""                     .19232987 0    1387.2  1 14
      "3I INFRASTRUCTURE"          2015 "No"  ""                      .1240206 0    1340.1  1 14
      "3I INFRASTRUCTURE"          2016 "No"  ""                     .07677372 0    1905.6  1 14
      "AAREAL BANK"                2013 "No"  ""                    .002722133 0     42981  5 14
      "AAREAL BANK"                2014 "No"  ""                    .005892205 0     49557  5 14
      "AAREAL BANK"                2015 "No"  ""                    .006525756 0     51948  5 14
      "AAREAL BANK"                2016 "Yes" "Limited/ Moderate"   .004171208 1     47708  5 14
      "ABENGOA"                    2012 "Yes" "Reasonable/ High"    .006104115 1  20545.32  8  7
      "ABENGOA"                    2013 "Yes" "Reasonable/ High"    .004823938 1 21152.846  8  7
      "ABENGOA"                    2014 "Yes" "Reasonable/ High"    .005850611 1 25246.596  8  7
      "ABENGOA"                    2015 "Yes" "Reasonable/ High"    -.07189515 1   16627.2  8  7
      "ABENGOA"                    2016 "Yes" "Limited/ Moderate"    -.4312886 1  9913.954  8  7
      "ABENGOA"                    2017 "Yes" "Limited/ Moderate"     .7201343 1  6358.597  8  7
      "ABN AMRO HOLDING"           2012 "No"  ""                   -.014079545 0     70954 10 14
      "ABN AMRO HOLDING"           2013 "No"  ""                   -.003391278 0     39808 10 14
      "ABN AMRO HOLDING"           2014 "Yes" "Limited/ Moderate"   -.00488302 1     23551 10 14
      "ABN AMRO HOLDING"           2015 "Yes" "Limited/ Moderate" -.0017520806 1     15981 10 14
      "ABN AMRO HOLDING"           2016 "No"  ""                    .006722689 0      8330 10 14
      "ACCOR"                      2013 "No"  ""                    .017705383 0      7060 13 34
      "ACCOR"                      2014 "No"  ""                     .02592804 0      8755 13 34
      "ACCOR"                      2015 "Yes" "Limited/ Moderate"   .023232436 1      8953 13 34
      "ACCOR"                      2016 "Yes" "Not specified"      .0088503035 1     11864 13 34
      "ACEA"                       2012 "Yes" "Reasonable/ High"    .009964245 1   6818.68 14 12
      "ACEA"                       2013 "Yes" "Limited/ Moderate"   .020027226 1  7087.352 14 12
      "ACEA"                       2014 "Yes" "Limited/ Moderate"   .023505114 1  6911.645 14 12
      "ACEA"                       2015 "Yes" "Limited/ Moderate"    .02609106 1  6706.972 14 12
      "ACEA"                       2016 "No"  ""                     .03799535 0  6904.713 14 12
      "ACEA"                       2017 "No"  ""                      .0244575 0  7387.591 14 12
      "ACTELION"                   2014 "No"  ""                     .21609704 0  2747.821 16 17
      "ACTELION"                   2016 "No"  ""                     .33059275 0  2106.477 16 17
      "ADVA AG OPTICAL NETWORKING" 2012 "Yes" "Not specified"        .05886059 1   284.095 19 31
      "ADVA AG OPTICAL NETWORKING" 2013 "No"  ""                     .01879022 0   293.078 19 31
      "ADVA AG OPTICAL NETWORKING" 2014 "No"  ""                     .02578216 0   324.837 19 31
      "ADVA AG OPTICAL NETWORKING" 2015 "No"  ""                    .068571135 0   391.535 19 31
      "ADVA AG OPTICAL NETWORKING" 2016 "Yes" "Reasonable/ High"     .04601917 1   467.892 19 31
      "ADVA AG OPTICAL NETWORKING" 2017 "No"  ""                   -.009111696 0   464.019 19 31
      "AIR FRANCE KLM"             2012 "No"  ""                    -.04338647 0     27474 22  3
      "AIR FRANCE KLM"             2013 "No"  ""                   -.067065254 0     25423 22  3
      "AIR FRANCE KLM"             2014 "No"  ""                    -.00835127 0     23230 22  3
      "AIR FRANCE KLM"             2015 "Yes" "Limited/ Moderate"   .004285408 1     23335 22  3
      "AIR FRANCE KLM"             2016 "No"  ""                     .02167277 0     22932 22  3
      "AIR FRANCE KLM"             2017 "No"  ""                    -.01192232 0     24408 22  3
      end
      label values firm firm
      label def firm 1 "3I INFRASTRUCTURE", modify
      label def firm 5 "AAREAL BANK", modify
      label def firm 8 "ABENGOA", modify
      label def firm 10 "ABN AMRO HOLDING", modify
      label def firm 13 "ACCOR", modify
      label def firm 14 "ACEA", modify
      label def firm 16 "ACTELION", modify
      label def firm 19 "ADVA AG OPTICAL NETWORKING", modify
      label def firm 22 "AIR FRANCE KLM", modify
      label values Sector_n Sector_n
      label def Sector_n 2 "Automotive", modify
      label def Sector_n 3 "Aviation", modify
      label def Sector_n 7 "Conglomerates", modify
      label def Sector_n 8 "Construction", modify
      label def Sector_n 12 "Energy Utilities", modify
      label def Sector_n 14 "Financial Services", modify
      label def Sector_n 17 "Healthcare Products", modify
      label def Sector_n 20 "Logistics", modify
      label def Sector_n 22 "Metals Products", modify
      label def Sector_n 23 "Mining", modify
      label def Sector_n 30 "Technology Hardware", modify
      label def Sector_n 31 "Telecommunications", modify
      label def Sector_n 34 "Tourism/Leisure", modify

      Comment

      Working...
      X