  • Converting Dataset for Dif-in-Dif Estimation

    Hello,

    I am struggling to restructure my current dataset so that I can run a DiD estimation.

    I have annual data per entity for a number of years (unbalanced panel). For each entity-year I observe the method of financial reporting, which is the binary DV (GAAP=1; Non-GAAP=0), the pension funded ratio, which is a continuous variable (IV), and a number of control variables.

    In my original model, I run a logistic regression with the method of reporting as the DV and the funded ratio as the IV of interest, plus controls. In simple terms: I am testing whether a higher funded ratio is associated with an increased likelihood of following GAAP.

    Because entities can switch their method of reporting from GAAP -> Non-GAAP and the other way around, it was suggested that I also run a difference-in-differences model. But I am not sure how to set up the data so that Stata can run the dif-in-dif estimation.

    As I see it, I essentially have 4 groups within my 11 years of data: those that are always GAAP, those that are never GAAP, those that switched to GAAP, and those that switched from GAAP. There is no specific point in time (or one-year regulation) at which they have to make the switch; the timing varies from entity to entity.

    Specific Questions:
    -How do I recode the data to define the “control” and “treatment” groups?
    -How do I create the “pre” and “post” periods?
    -What if my outcome variable is binary?
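
A rough sketch of how the four groups and a per-entity "pre"/"post" indicator might be derived from a long panel is given below. It is written in plain Python rather than Stata, purely to illustrate the recoding logic. The field names (`entity`, `year`, `gaap`), the group labels, and the toy data are all hypothetical, and putting entities that switch more than once into a separate `multiple_switches` category is an assumption, not something the data description above settles.

```python
from collections import defaultdict


def classify_entities(rows):
    """Map each entity to (group, switch_year).

    rows: iterable of (entity, year, gaap) tuples with gaap in {0, 1}.
    switch_year is the first observed year whose status differs from the
    previous observed year, or None if the status never changes.
    """
    by_entity = defaultdict(list)
    for entity, year, gaap in rows:
        by_entity[entity].append((year, gaap))

    groups = {}
    for entity, obs in by_entity.items():
        obs.sort()  # chronological order; missing years are simply skipped
        switch_years = [
            obs[i][0]
            for i in range(1, len(obs))
            if obs[i][1] != obs[i - 1][1]
        ]
        first_status, last_status = obs[0][1], obs[-1][1]
        if not switch_years:
            group = "always_gaap" if first_status == 1 else "never_gaap"
            groups[entity] = (group, None)
        elif len(switch_years) == 1:
            group = "switched_to_gaap" if last_status == 1 else "switched_from_gaap"
            groups[entity] = (group, switch_years[0])
        else:
            # Assumption: repeat switchers are set aside, not forced into a group.
            groups[entity] = ("multiple_switches", switch_years[0])
    return groups


def add_event_time(rows, groups):
    """Return rows extended with (group, event_time, post).

    event_time is year minus switch_year (None for non-switchers);
    post is 1 from the switch year onward and 0 otherwise.
    """
    out = []
    for entity, year, gaap in rows:
        group, switch_year = groups[entity]
        if switch_year is None:
            out.append((entity, year, gaap, group, None, 0))
        else:
            event_time = year - switch_year
            out.append((entity, year, gaap, group, event_time, int(event_time >= 0)))
    return out


# Made-up toy panel, unbalanced on purpose (entity B has no 2002 record).
panel = [
    ("A", 2001, 1), ("A", 2002, 1), ("A", 2003, 1),
    ("B", 2001, 0), ("B", 2003, 0),
    ("C", 2001, 0), ("C", 2002, 0), ("C", 2003, 1), ("C", 2004, 1),
    ("D", 2002, 1), ("D", 2003, 0),
]
groups = classify_entities(panel)
long_panel = add_event_time(panel, groups)
```

Because the panel is unbalanced, the switch year found this way is only the first observed year under the new method; if an entity has a gap around the switch, the true switch year may be earlier.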

    Any thoughts, suggestions or directions towards materials describing similar set-up are greatly appreciated.

  • #2
    You didn't get a quick answer. You'll get a better response if you follow the FAQ on asking questions - provide Stata code in code delimiters, Stata output, and sample data using dataex.

    I'm not a DID user, but it seems to me that the DV and the IV are being confused here. Normally, DID relies on something happening at a given time to part of the sample (the treatment). You then estimate the effect from the change in an outcome variable in that part of the sample relative to the change in the rest of the sample over the same period. But here, the change itself (the switch in reporting method) seems to be the outcome, not an explanatory variable.
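
For reference, the textbook two-group, two-period comparison described above can be sketched as follows. This is a minimal illustration in plain Python with made-up numbers, not a recommendation for the poster's data: the function name `did_estimate` and the `(treated, post, outcome)` record layout are hypothetical. With a 0/1 outcome the cell means are simply proportions, so the result is a difference in differences of proportions.

```python
def did_estimate(records):
    """Classic 2x2 difference-in-differences on cell means.

    records: iterable of (treated, post, outcome) with treated, post in {0, 1}.
    Returns (treated post - treated pre) - (control post - control pre).
    """
    cells = {(t, p): [] for t in (0, 1) for p in (0, 1)}
    for treated, post, outcome in records:
        cells[(treated, post)].append(outcome)
    # Mean outcome in each of the four group-by-period cells.
    means = {key: sum(vals) / len(vals) for key, vals in cells.items()}
    change_treated = means[(1, 1)] - means[(1, 0)]
    change_control = means[(0, 1)] - means[(0, 0)]
    return change_treated - change_control


# Made-up binary outcomes, four observations per cell.
records = [
    (1, 0, 0), (1, 0, 0), (1, 0, 1), (1, 0, 0),  # treated, pre:  mean 0.25
    (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 0),  # treated, post: mean 0.75
    (0, 0, 0), (0, 0, 1), (0, 0, 0), (0, 0, 0),  # control, pre:  mean 0.25
    (0, 1, 0), (0, 1, 1), (0, 1, 1), (0, 1, 0),  # control, post: mean 0.50
]
effect = did_estimate(records)  # (0.75 - 0.25) - (0.50 - 0.25) = 0.25
```

Note that this layout needs a treatment that is separate from the outcome, with a defined pre and post period for each group.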

    I could have this completely wrong but I don't see this as lending itself to a DID.
