  • What is a GROUP variable in difference in differences analysis?

    Query: What is a GROUP variable in difference in differences analysis?

    Study: Newspaper articles from 1985-2017 were categorized as either in favor, against, or neutral toward a specific topic. These are weighted and graphed.
    Hypothesis: An event happened around 2007 that triggered a change in attitudes. I'd like to test this with a DID analysis.
    Unit of Analysis: Newspapers for each year.

    Outcome Variable: Favor, defined as newspaper articles with a favorable stance
    Treatment: Treatment, defined as 0/1 for "before 2008" and "2008 and on"
    Time Variable: Year, defined as years 1985-2017
    Group: ???

    Syntax: didregress (Favor) (Treatment), group(?????) time(Year)

    DID Model: See the image of the model
    [Image: DID model.png]

    A Control Variable?: I don't know whether one is needed for DID. I created a variable, Control, equal to the mean of Favor and Against.


    Example Data:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int Year byte(Favor Against Neutral Treatment Control)
    1985  29 71  0 0 50
    1986  16 44 40 0 30
    1987  43 43 14 0 43
    1988  51 27 22 0 39
    1995  60 10 30 0 35
    1996  33 57 10 0 45
    1997  30 69  1 0 50
    1998  50  0 50 0 25
    1999  44 31 25 0 38
    2000  42 40 18 0 41
    2001  80 20  0 0 50
    2003  80 20  0 0 50
    2004   0 75 25 0 38
    2005  38  2 60 0 20
    2006 100  0  0 0 50
    2007  15 55 30 0 35
    2008  19 68 13 1 44
    2009  44 38 18 1 41
    2010  72 14 14 1 43
    2016  88 12  0 1 50
    2017  82  9  9 1 46
    2018  79  7 14 1 43
    end

    What is a GROUP variable?
    Last edited by Sara Bruene; 30 Oct 2022, 14:06.

  • #2
    The simple form of difference-in-differences looks at a double difference. One difference is over time (before vs. after the event); the other is across groups. There needs to be one group that was never treated, and another group that was not treated before the event but was treated after it.
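
    In symbols, the double difference can be written as below. The notation is assumed here for illustration: T is the treated group, C is the never-treated group, and the bars denote group means of the outcome before and after the event.

```latex
\[
\hat{\delta}_{\text{DID}}
  = \left(\bar{Y}_{T,\text{post}} - \bar{Y}_{T,\text{pre}}\right)
  - \left(\bar{Y}_{C,\text{post}} - \bar{Y}_{C,\text{pre}}\right)
\]
```

    With only one series observed before and after, there is nothing to put in the second bracket, so only the first (before/after) difference can be computed.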

    Your description is only about before vs. after. I don't see how you have a treatment group and a control group. Are you sure your setup is even appropriate for diff-in-diff?
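
    For contrast, here is a minimal sketch of the kind of data layout didregress works with, assuming Stata 17 or later. The data are made up purely for illustration: the variables paper and exposed, and all of the values, are hypothetical and do not come from the original post. Each row is one newspaper in one year, and Treatment equals 1 only for exposed newspapers from 2008 on.

```stata
* Hypothetical data (NOT the poster's): four newspapers, four years each.
* paper   = newspaper identifier, used as the group variable
* exposed = 1 if the newspaper belongs to the group affected by the event
clear
input int Year byte(paper exposed Favor)
2006 1 0 40
2006 2 0 35
2006 3 1 42
2006 4 1 38
2007 1 0 41
2007 2 0 37
2007 3 1 41
2007 4 1 40
2008 1 0 43
2008 2 0 36
2008 3 1 55
2008 4 1 52
2009 1 0 42
2009 2 0 38
2009 3 1 58
2009 4 1 53
end

* Treatment switches on only for exposed newspapers in 2008 and later
generate byte Treatment = (exposed == 1) & (Year >= 2008)

* DID estimate with newspaper (group) and year (time) effects
didregress (Favor) (Treatment), group(paper) time(Year)
```

    In the single yearly series from the original post, every row would carry the same value of any group variable, and Treatment would be fully determined by Year, so there is no untreated comparison group for the command to use.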

    • #3
      Thank you for your reply. I agree that it requires a "never treated" group. For that, I created a new variable (Control) defined as the mean of the Favor and Against variables. It is the last column of the dataset and is displayed as the gold line on the model above. Is this what Stata wants for "group"? The syntax is didregress (Favor) (Treatment), group(?) time(Year), but Stata says that the group variable should be categorical rather than a continuous variable (number of articles), right? This is what confuses me. I think the group categories need to be something like "this group treated" and "this group not treated," but I don't know how to do that with time series, which is exactly what DID is supposed to handle, right?
      Last edited by Sara Bruene; 30 Oct 2022, 18:14.

      • #4
        No, this is conceptually muddled. The attitude of the newspaper cannot both be the outcome and the treatment. Just artificially creating multiple variables for attitude does not overcome the basic conceptual problem.

        • #5
          Is that due to too much correlation between the outcome variable "favor" and the control variable "the mean of favor/against"? If that's what you're suggesting, I think that makes sense to me now. Thank you.

          1. Would it work to create a different control variable that isn't connected (correlated) to the attitude? e.g. a specific point (like -0.4) to create the gold line in the model above?
          2. Looking at the model above, there is a clear divergence between the favor and against attitudes around 2007. If the DID analysis doesn't work to test the divergence, do you have a suggestion for a better analysis? ARIMA?
          3. I know I can run a t test, but I'd like something more advanced.
