  • Creating huge number of fixed effects

    Hi everyone,

    I have a large firm-level panel with 1 million units over 20 years. I want to run a diff-in-diff with the recent de Chaisemartin and D'Haultfoeuille estimator, but their command did_multiplegt does not allow factor variables. How can I create such a large number of fixed effects at the firm level across time, and separately across counties? I can run a loop, but it is so computationally demanding that I cannot run it even on the server. The code for county and industry fixed effects is below (these involve fewer fixed effects). The problem is that the number of firms is too large: when I modify the same code for firms, it doesn't run (not enough memory).

    Thank you!

    Code:
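        * Interact every year dummy with every county and industry dummy.
        * Storing the products as byte keeps memory use as low as possible.
        forval q = 1/50 {
            * For each county
            forval c = 1/279 {
                gen byte county_year_`q'_`c' = year_dummy`q' * county_dummy`c'
            }
            * For each industry
            forval i = 1/50 {
                gen byte industry_year_`q'_`i' = year_dummy`q' * industry_dummy`i'
            }
        }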

  • #2
    Do you really want to estimate all these coefs for the FEs? If not, you can use areg to absorb them instead of estimating them, which is easier on the hardware.
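    To illustrate absorbing fixed effects with areg, here is a minimal sketch on simulated data. The variable names (firm_id, year, inspected, investment) and the data-generating process are assumptions made up for the example, not the poster's actual data. Note that areg absorbs a single categorical variable.

```stata
* Toy panel: 200 hypothetical firms observed over 20 years
clear
set seed 12345
set obs 200
gen long firm_id = _n
gen double firm_effect = rnormal()
expand 20
bysort firm_id: gen int year = 2000 + _n
gen byte inspected = runiform() < 0.2
gen double investment = 1 + 0.5*inspected + firm_effect + rnormal()

* Absorb the firm effects instead of creating one dummy per firm;
* year effects enter as ordinary factor variables
areg investment inspected i.year, absorb(firm_id) vce(cluster firm_id)
```

    The firm effects are swept out rather than estimated, so no firm dummies are ever stored in memory.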
    Best wishes

    (Stata 16.1 MP)



    • #3
      I agree with Felix
      Many of the new estimators already use a kind of year*control interaction in the specification, so there is no need to add them to the model specification.
      Furthermore, that many dummies will most likely overfit the model, and will not allow you to control for anything.
      Perhaps you need to restate your research question/design?
      HTH



      • #4
        Hi both,

        Thanks for answering! We're trying to understand the impact of environmental inspections on investment. The problem is that the inspections are endogenous and we do not have the precise model used by the regulator to decide who gets inspected. Hence, we want to look at within-firm variation only, because this controls for a lot of unobservables that could bias our estimates (and we also want to use industry*time or county*time FE). To give you an idea of why we want this estimation: some firms are inspected only once and some multiple times, the firms are of varying sizes, and we cannot see firm size in our data.
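        As a rough sketch of what this amounts to in Stata: each industry*time or county*time cell can be stored as one integer ID variable instead of thousands of dummies, and within-firm variation is the deviation from each firm's own mean. The variable names (firm_id, county, industry, year, investment) and the simulated data are assumptions for illustration only.

```stata
* Toy panel with hypothetical names: 100 firms, 5 counties, 4 industries, 20 years
clear
set seed 2021
set obs 100
gen long firm_id = _n
gen byte county = 1 + mod(_n, 5)
gen byte industry = 1 + mod(_n, 4)
expand 20
bysort firm_id: gen int year = 2000 + _n
gen double investment = rnormal()

* One integer ID per county-year and industry-year cell
* (two variables in total, not one dummy per cell)
egen long county_year = group(county year)
egen long industry_year = group(industry year)

* Within-firm variation: deviation of investment from the firm's own mean
bysort firm_id: egen double firm_mean = mean(investment)
gen double investment_within = investment - firm_mean

summarize county_year industry_year investment_within
```

        Whether an estimation command can use such ID variables directly depends on its own options, so that part still needs checking against the command's help file.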

        FernandoRios: I know you work with Callaway and Sant'Anna in particular, but I am using C&H because the inspections occur at multiple times, so treatment turns on and off at various points, and C&H offer this flexibility.

        Thanks.



        • #5
          Re-upping because I still have no solution. Thanks!



          • #6
            Given the specific nature of your question, perhaps it would be a good idea to contact the authors of the paper itself.
            They may offer guidance on how flexible their method is for addressing multiple high-dimensional fixed effects.
            You may also want to consider regression-based methods, which allow you to be more flexible in terms of adding fixed effects.
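            One possible regression-based sketch, assuming the community-contributed reghdfe command is installed (ssc install reghdfe; recent versions also need ftools). The variable names and simulated data are hypothetical. This is a plain fixed-effects regression, not the de Chaisemartin and D'Haultfoeuille estimator.

```stata
* Assumption: reghdfe (community-contributed) is installed. Names are hypothetical.
clear
set seed 42
set obs 300
gen long firm_id = _n
gen byte county = 1 + mod(_n, 6)
gen byte industry = 1 + mod(_n, 5)
gen double firm_effect = rnormal()
expand 20
bysort firm_id: gen int year = 2000 + _n
gen byte inspected = runiform() < 0.2
gen double investment = 0.5*inspected + firm_effect + 0.1*county*(year - 2000) + rnormal()

* One ID variable per county-year and industry-year cell
egen long county_year = group(county year)
egen long industry_year = group(industry year)

* Absorb firm, county-by-year and industry-by-year effects without creating dummies
reghdfe investment inspected, absorb(firm_id county_year industry_year) vce(cluster firm_id)
```

            All three sets of effects are absorbed rather than estimated, so memory use does not grow with the number of firms or cells the way explicit dummies do.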



            • #7
              Thanks, Fernando!
