  • Difference in difference regression

    Does anybody know what (panel data) model is used here?
    It does not look like a normal DiD regression to me. Can anybody help me work out what kind of DiD regression they have used here?

    $$y_{m,r,t} = \alpha + \gamma D_{m,t} + \beta X_{m,t} + \delta_r + \sigma_t + \varepsilon_{m,r,t}$$
    where $y_{m,r,t}$ is the outcome of interest for municipality m in region r at time t, and $\alpha$ is a constant.
    $D_{m,t}$ is an indicator variable equal to zero for a municipality before it receives a hydroelectric
    power plant, and equal to one after one is built. $X_{m,t}$ is a set of municipality-specific
    characteristics that are time-variant, including doctors per capita, the number of people in
    poverty, and taxable income per taxpayer. $\delta_r$ is a set of indicator variables at the regional level,
    allowing for differences in levels between the five regions.

    Thanks in advance

  • #2
    Ludmila:
    I think you might get clarity on this by reading
    1. Wooldridge, J. (2013). Introductory Econometrics: A Modern Approach. 5th Edition. South-Western. Particularly Chapter 13.2, Policy Analysis with Pooled Cross Sections. Or the paper cited in that section:
    2. Kiel, K. A., and K. T. McClain (1995), “House Prices during Siting Decision Stages: The Case of an Incinerator from Rumor through Operation,” Journal of Environmental Economics and Management 28, 241–255.



    • #3
      Thanks Abdul,
      I have read Wooldridge, J. (2013), Introductory Econometrics.
      I don't think it's about pooled cross sections. The example above uses panel data. What I don't understand is the dummy variable "D": why is only one dummy used in a DiD regression?



      • #4
        Ludmila, what do the data look like? Do you have the same municipalities over time? If so, we could certainly call it panel data. I admit I am still not completely clear what you are asking, though.



        • #5
          This does not look like a difference-in-differences model to me. The variable D appears to be simply a pre-post indicator variable. There does not appear to be a control group. Rather, this looks like a simple pre-post comparison in municipalities that received a hydroelectric plant.



          • #6
            Thanks Clyde for always answering my (not so clear) questions
            I have another less complicated question.
            In the dataex below you see variables:

            ARSTALL = Year (time variable)
            depid= Department ID
            facid= Faculty ID
            public= Number of publications per year in departments

            1. How can I create control and treatment groups? (I tried to generate a dummy, but for some reason it didn't work.)
            The control group has to be all departments with facid equal to 3 or 4; the rest of the departments are the treatment group. The outcome (dependent) variable is "public".

            2. How can I control for different trends in departments?

            Thanks in advance

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input int ARSTALL float(depid facid public)
            2005  1 13   6
            2006  1 13   5
            2008  1  3   1
            2010  1 10   2
            2011  1  8   7
            2012  1  1   6
            2013  1 12   3
            2014  1  1   6
            2015  1  4   6
            2016  1  4   1
            2005  2 14   2
            2006  2 14   2
            2007  2 14   6
            2008  2 14   2
            2011  2  3   1
            2012  2  3   1
            2014  2  3   1
            2016  2  3   1
            2008  3 14   2
            2006  4 14  17
            2007  4 14  15
            2008  4 14  11
            2010  4 14   7
            2006  5  5   1
            2014  5 11   1
            2015  5 13   1
            2005  6  3  28
            2006  6  3  32
            2007  6  3  38
            2008  6  6  28
            2009  6  3  34
            2010  6  3  13
            2011  6  3  25
            2012  6  3  42
            2013  6  3  40
            2014  6  3  42
            2015  6  3  32
            2016  6  3  48
            2005  7  3  31
            2006  7  3  32
            2007  7  6  30
            2008  7  3  29
            2009  7  3  39
            2010  7  3  42
            2011  7  3  52
            2012  7  3  58
            2013  7  3  48
            2014  7  3  61
            2015  7  3  33
            2016  7  3  77
            2005  8  6   2
            2006  8  6   1
            2007  8  6   5
            2008  8  6   4
            2012  8  6   4
            2013  8  6   3
            2014  8  6   6
            2015  8  6   3
            2016  8  6   6
            2005  9  4  48
            2006  9  4 134
            2007  9  4  74
            2008  9  4 103
            2009  9  4 102
            2010  9  4 102
            2011  9  4  96
            2012  9  4 100
            2013  9  4 107
            2014  9  4  75
            2015  9  4 120
            2016  9  4 101
            2005 10  4  49
            2006 10  4  71
            2007 10  4  35
            2008 10  4  61
            2009 10  4  65
            2010 10  4  68
            2011 10  6  79
            2012 10  4  72
            2013 10  6  15
            2014 10  6  14
            2015 10  6  13
            2016 10  6  14
            2005 11  4  23
            2006 11  4  34
            2007 11  4  32
            2008 11  4  34
            2009 11  4  51
            2010 11  4  47
            2011 11  4  93
            2012 11  4  49
            2015 11  4   3
            2005 12  4  74
            2006 12 14 110
            2007 12  4 150
            2008 12  4 160
            2009 12  4 187
            2010 12  4 184
            2011 12  4 135
            2012 12  4 142
            end



            • #7
              Note: the pre-treatment period is 2005, and the treatment happens in 2006.



              • #8
                1. How can create control and treatment group? (I tried to generate dummy, but for some reason it didn't work)
                The control group has to be all departments with "facid = 3 and 4" , the rest of departments are treatment groups. The outcome (dependent) variable is "public" .
                I'm afraid "I tried to generate dummy, but for some reason it didn't work" isn't helpful: you don't show the code you tried, nor do you show what you got as a result, nor why it wasn't what you wanted. So I can't really help you fix what you did. In this case, however, it is easy enough to just give you code that does this:

                Code:
                gen byte arm = !inlist(facid, 3, 4) if !missing(facid)
                label define arm 0 "Control" 1 "Treatment"
                label values arm arm
                2. How can I control for different trends in departments?
                This question is much too broad to answer. Please ask a more specific question.
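
                For readers following along, here is a minimal sketch of how the arm indicator defined earlier in this post might feed into a basic DiD regression. The data are toy values invented for illustration (only the variable names match the -dataex- example in #6), and the post indicator assumes that 2006 onward is the post-treatment period, as #7 suggests. Treat it as a starting point under those assumptions, not a recommended final model.

                ```stata
                * Toy data, invented for illustration; variable names follow the -dataex- example in #6
                clear
                input int ARSTALL float(depid facid public)
                2005 1 3 10
                2006 1 3 12
                2007 1 3 11
                2005 2 4 20
                2006 2 4 21
                2007 2 4 23
                2005 3 3 15
                2006 3 3 15
                2007 3 3 16
                2005 4 6  5
                2006 4 6  9
                2007 4 6 10
                2005 5 8  7
                2006 5 8 12
                2007 5 8 13
                2005 6 1  6
                2006 6 1 10
                2007 6 1 11
                end

                * Treatment arm: 0 for faculties 3 and 4 (control), 1 otherwise
                gen byte arm = !inlist(facid, 3, 4) if !missing(facid)
                label define arm 0 "Control" 1 "Treatment"
                label values arm arm

                * Assumption: treatment starts in 2006, so 2006 onward is "post"
                gen byte post = ARSTALL >= 2006 if !missing(ARSTALL)

                * The coefficient on 1.arm#1.post is the DiD estimate;
                * standard errors are clustered by department
                regress public i.arm##i.post, vce(cluster depid)
                ```

                In the real data from #6, facid is not constant within every depid, so it would be worth checking whether a department's arm changes over time before relying on a regression like this.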



                • #9
                  Difference in differences can be generalized to a two way fixed effects model by dropping the treatment group and post treatment indicators and adding group and year fixed effects.

                  Traditional diff-in-diff is:


                  Code:
                  * ## expands to both main effects plus their interaction
                  reg y i.treatment##i.post

                  This can be generalized to two-way fixed effects by:

                  Code:
                  * data must be -xtset- first; 1.treatment#1.post is the interaction term only
                  xtreg y 1.treatment#1.post i.year, fe
                  The group fixed effect soaks up all time-invariant characteristics of the group, including treatment status. The year effects soak up all year-specific characteristics, including post-treatment status.

                  Econometrics is moving towards the generalized two-way fixed effects diff-in-diff, as it is more robust. Using year effects instead of a post-treatment dummy is always necessary with more than two periods, or the result will be biased (see Wooldridge). However, more people are generalizing the treatment fixed effect to a group fixed effect. It's probably best to provide both models:

                  one which would be

                  Code:
                  reg y i.treatment 1.treatment#1.post i.year
                  and one which would be

                  Code:
                  xtreg y 1.treatment#1.post i.year, fe
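
                  The three specifications discussed in this post can be tried side by side on simulated data. The sketch below is only an illustration: the panel, the variable names (id, year, treatment, post, y), and the true effect of 2 are all invented, and the interaction is written in Stata's factor-variable notation. It stores the interaction coefficient from each model so they can be compared.

                  ```stata
                  * Simulated balanced panel (all values invented for illustration)
                  clear
                  set seed 12345
                  set obs 40
                  gen id = _n
                  gen byte treatment = id > 20          // units 21-40 are treated
                  expand 6
                  bysort id: gen year = 2000 + _n       // years 2001-2006
                  gen byte post = year >= 2004          // treatment starts in 2004
                  gen y = 1 + 0.5*treatment + 0.3*(year - 2000) + 2*treatment*post + rnormal()
                  xtset id year

                  * 1. Classic DiD: treatment and post main effects plus interaction
                  regress y i.treatment##i.post, vce(cluster id)
                  scalar b_classic = _b[1.treatment#1.post]

                  * 2. Year effects replace the post dummy
                  regress y i.treatment 1.treatment#1.post i.year, vce(cluster id)
                  scalar b_yearfe = _b[1.treatment#1.post]

                  * 3. Two-way fixed effects: unit effects absorb the treatment main effect
                  xtreg y 1.treatment#1.post i.year, fe vce(cluster id)
                  scalar b_twfe = _b[1.treatment#1.post]

                  display "classic: " b_classic "  year FE: " b_yearfe "  TWFE: " b_twfe
                  ```

                  In a balanced panel with a single treatment date, like this simulated one, the three point estimates of the interaction coincide; they can differ in unbalanced panels or when covariates are added.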



                  • #10
                    I was referring to the "common trends assumption" in DiD.
                    How can the common trends assumption be relaxed when the control and treatment groups don't have "parallel" trends over time?



                    • #11
                      Thank you very much Philip



                      • #12
                        Originally posted by Ludmila Farooq
                        I was referring to the "common trends assumption" in DiD.
                        How can the common trends assumption be relaxed when the control and treatment groups don't have "parallel" trends over time?
                        The parallel trends assumption can be relaxed by adding group-specific polynomial time trends to the model.

                        The parallel trends assumption requires that trends have the same slope, or first derivative. Adding a group-specific linear time trend changes the assumption so that trends may now differ in the first derivative but must be the same in the second derivative. In other words, the groups' trends may diverge as long as the divergence is linear; they may not differ in curvature. That weaker assumption is likely to hold provided there was no shock during the pretreatment period that affected only the treatment group.

                        You can relax the assumption further by adding both group-specific linear trends and group-specific quadratic trends. Now trends can differ in the first and second derivatives but not the third.

                        See Mora and Reggio (2017), flexible diff in diff with alternate parallel assumptions (working paper available online).
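
                        One way to write the model with a group-specific linear trend is sketched below. The notation is an assumption made for illustration and is not taken from the cited paper: $\alpha_i$ is a unit fixed effect, $\lambda_t$ a year effect, $\text{Treat}_i$ the treatment-group indicator, $\text{Post}_t$ the post-period indicator, and $t$ a time counter.

                        ```latex
                        % Group-specific linear trend
                        y_{it} = \alpha_i + \lambda_t
                               + \delta \,(\text{Treat}_i \times \text{Post}_t)
                               + \theta_1 \,(\text{Treat}_i \times t)
                               + \varepsilon_{it}

                        % Adding a group-specific quadratic trend
                        y_{it} = \alpha_i + \lambda_t
                               + \delta \,(\text{Treat}_i \times \text{Post}_t)
                               + \theta_1 \,(\text{Treat}_i \times t)
                               + \theta_2 \,(\text{Treat}_i \times t^2)
                               + \varepsilon_{it}
                        ```

                        Here $\delta$ is the treatment effect of interest, while $\theta_1$ and $\theta_2$ let the treatment group's trend differ from the control group's in slope and curvature.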



                        • #13
                          Thank you Philip
                          Could you please give me the code for "group-specific polynomial time trends"?
                          (I couldn't find the paper, by the way.)



                          • #14
                            The working paper is Mora and Reggio (2012), Treatment effect identification using alternative parallel assumptions.

                            To implement polynomial time trends, you need to create a time trend variable that is coded 1 for the first year in your panel, 2 for the second year, and so on up to n for the nth year. You then add both fixed effects and an interaction of that variable with either your panel ID dummies or your treatment group dummy. The second approach uses fewer degrees of freedom and may more easily yield a significant coefficient.

                            For higher-degree polynomial trends, square your time trend in a new variable and add the squared variable, interacted with your panel or group dummies, to the model with linear trends. It's important to include fixed effects and all lower-degree terms in these models as you add higher-degree trends.
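
                            The steps described in this post could be sketched roughly as follows. This is an illustration on simulated data, not a definitive implementation: the variable names (id, year, treatment, post, t, t2, y) and all numeric values are invented, and the interactions use Stata's factor-variable notation.

                            ```stata
                            * Simulated panel (all values invented for illustration)
                            clear
                            set seed 2024
                            set obs 40
                            gen id = _n
                            gen byte treatment = id > 20
                            expand 8
                            bysort id: gen year = 2000 + _n      // years 2001-2008
                            gen byte post = year >= 2005

                            * Time trend: 1 for the first year, 2 for the second, ...
                            egen t = group(year)
                            gen t2 = t^2

                            * Outcome with a steeper linear trend in the treated group
                            gen y = 0.2*t + 0.3*treatment*t + 1.5*treatment*post + rnormal()
                            xtset id year

                            * Treatment-group-specific linear trend
                            xtreg y 1.treatment#1.post i.year 1.treatment#c.t, fe vce(cluster id)

                            * Add a treatment-group-specific quadratic trend (keep the linear term)
                            xtreg y 1.treatment#1.post i.year 1.treatment#c.t 1.treatment#c.t2, fe vce(cluster id)

                            * Alternative: a separate linear trend for every panel unit
                            * (uses many more degrees of freedom; Stata omits one collinear term)
                            xtreg y 1.treatment#1.post i.year i.id#c.t, fe vce(cluster id)
                            ```

                            The coefficient on 1.treatment#1.post is the treatment effect in each model; comparing it across the three specifications gives a sense of how sensitive the estimate is to the trend assumption.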



                            • #15
                              Philip Gigliotti I notice an acceleration in your postings here on the Forum. Thank you for your contributions. One suggestion (not originally mine, it's in the FAQ): when giving a reference give the complete reference information (or provide a link if it's an online source). Maybe everybody in econometrics knows what mora and reggio 2012 (or is it 2017, or are those two different papers by the same authors) is, and probably Ludmila Farooq will, too, as it seems like she is in your field. But these posts are read by other people who may be interested in learning the methodology. So as a courtesy to them and to make the Forum maximally helpful to those who read it but don't post, it is best to show complete references.

                              One suggestion regarding polynomial trends of higher than quadratic degree. If your T is at all large, when you get to T^3 and higher powers, you will be generating variables with some very large numerical values. Combined with other variables in the same data set that have smaller values, these can lead to numerical difficulties when Stata tries to estimate the models and may, ultimately, cause convergence failure. So when using variables like that, it is a good idea to either re-scale them or use T centered at some reasonable value (mean, median, etc.) and its powers for the time trends. The centering approach has the additional slight advantage that it reduces collinearity among the time trend powers, so that if estimation of those effects is directly of interest, you can do that with greater precision. (Admittedly, these trends are usually incorporated for adjustment purposes only, but sometimes they are of interest.)
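
                              A minimal sketch of the centering suggestion, using an invented time variable t running from 1 to 30 (the variable names and the range are assumptions for illustration). It builds raw and mean-centered powers and compares their correlations.

                              ```stata
                              * Invented time index, 1 to 30
                              clear
                              set obs 30
                              gen t = _n

                              * Raw powers grow quickly (30^3 = 27000)
                              gen t2 = t^2
                              gen t3 = t^3

                              * Center at the mean before taking powers
                              summarize t, meanonly
                              gen tc  = t - r(mean)
                              gen tc2 = tc^2
                              gen tc3 = tc^3

                              * Compare collinearity among raw and centered powers
                              correlate t t2 t3
                              correlate tc tc2 tc3
                              ```

                              With an evenly spaced time index like this one, centering removes the correlation between the linear and quadratic terms; the linear and cubic terms remain correlated, so centering reduces collinearity without eliminating it.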
