  • Cross-classified random effects model for event studies with firms and event dates in Stata

    Hello All,

    I am working on an event study using panel data in which firms (identified by firm_id) appear in multiple events (identified by eventdate), and each event can involve multiple firms. The data structure is therefore cross-classified rather than hierarchical: firms are not nested in events, and events are not nested in firms.

    Because firms and events are separate, non-nested grouping factors that both influence cumulative abnormal returns (CAR), I want to give each its own random intercept (a cross-classified model). A given firm may respond to many events, and a given event may affect many firms, so I need to model the two effects separately rather than treating one as nested within the other.

    My goal is to estimate how certain covariates (e.g., X1, X2, X3 ) affect CAR.

    Based on my understanding, I should specify a crossed random effects model with random intercepts for both firm_id and eventdate, like this:
    Code:
    mixed CAR X1 X2 X3 || _all: R.eventdate || firm_id:
    I tried to run this command, but Stata displayed the following result:

    Code:
    Performing EM optimization ...
        _mixed_decomp_hier():  3900  unable to allocate real <tmp>[868515,6796]
          _xtm_mixed_ll_bi():     -  function returned error
    failed to allocate a 868515 x 6796 real matrix
        _mixed_decomp_hier():  3900  out of memory
          _xtm_mixed_ll_bi():     -  function returned error
            _xtm_em_iter_u():     -  function returned error
              _xtm_em_iter():     -  function returned error
                     <istmt>:     -  function returned error
    I would be grateful if you could advise on two points: what is the correct mixed syntax for a cross-classified random intercept model in this context, and are there any important considerations (e.g., data structure or convergence issues) when using mixed for crossed random effects in large panels?


    Thank you in advance,
    Nick
    Last edited by Nick Baradar; 30 Jun 2025, 07:26.

  • #2
    Hi Nick,

    With cross-classified models in Stata, the standard guidance is to treat the random effect with fewer levels as the _all: R. random factor. With this error, it seems likely that you have a lot of unique eventdates. If indeed you have fewer firm_ids than eventdates, then try running the model with the random effects reversed:
    Code:
    mixed CAR X1 X2 X3 || _all: R.firm_id || eventdate:
    Do you get estimates with that specification? If not, you may need to reduce the number of eventdates to estimate the model, ideally in a principled way.
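
    To see which factor has fewer levels before choosing the _all: R. term, the distinct values of each identifier can be counted. The following is only a minimal sketch: it builds a small toy data set so that it runs on its own, and tag_firm, tag_event, n_firms, and n_events are hypothetical helper names. On the real data, skip the toy-data lines and keep the rest.

    ```stata
    * Toy data for illustration only; skip these lines to use the real data
    clear
    set obs 6
    generate firm_id   = ceil(_n/2)      // firms 1,1,2,2,3,3
    generate eventdate = mod(_n, 4) + 1  // event dates 2,3,4,1,2,3

    * Flag one observation per distinct firm and per distinct event date
    egen byte tag_firm  = tag(firm_id)
    egen byte tag_event = tag(eventdate)

    * Count the flagged observations to get the number of levels
    quietly count if tag_firm
    local n_firms = r(N)
    quietly count if tag_event
    local n_events = r(N)

    display "firms: `n_firms'   event dates: `n_events'"
    drop tag_firm tag_event
    ```

    Because the counts are stored in local macros, the lines should be run together from one do-file.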

    Another option that may help you is Ben Jann's supclust, which creates superordinate categories based on the values of two or more classification variables. If it works, you can use the single super cluster as your random intercept.
    Code:
    ssc install supclust
    Last edited by Erik Ruzek; 30 Jun 2025, 09:48. Reason: Added info about supclust


    • #3
      Thank you for the clarification, Erik Ruzek! I have more event dates than firms, so I believe the initial command was specified correctly. For exploratory purposes, I also tried switching the order of the firm ID and event date, but I encountered the same error. Additionally, I reduced the number of observations from 2.5 million to 350 thousand, yet the error persisted. I checked the variation in the variables and found no apparent issues, ran a VIF analysis that showed no multicollinearity, and removed any variables with correlations greater than 0.5, but I still couldn't obtain estimates. I'm unsure why the command doesn't produce results.


      • #4
        The error message itself is telling you the problem:
        Code:
        failed to allocate a 868515 x 6796 real matrix
        _mixed_decomp_hier(): 3900 out of memory
        To solve your problem, Mata needs to allocate that huge matrix, which is over 40 GB. If you had that much memory available and your OS were willing to provide it to the process, the calculation could proceed. The size is not directly related to the number of observations in your data set; it is more related to the number of crossed effects in your model. The bottom line is that you cannot estimate a model with that many crossed effects in your current environment. You will need to move it to a machine where a request for 40 GB of memory for the process is feasible, with an OS that will grant it.
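
        The size can be checked from the dimensions in the error message. This is a back-of-the-envelope sketch that assumes 8 bytes per element (a Mata real is stored as a double); the scalar names are hypothetical. It comes to roughly 47 GB.

        ```stata
        * Dimensions reported in the error message
        scalar n_rows = 868515
        scalar n_cols = 6796

        * 8 bytes per double-precision element
        scalar mat_bytes = scalar(n_rows) * scalar(n_cols) * 8
        scalar gb  = scalar(mat_bytes) / 1e9    // decimal gigabytes
        scalar gib = scalar(mat_bytes) / 2^30   // binary gibibytes

        display "decimal GB: " %5.1f scalar(gb)
        display "binary GiB: " %5.1f scalar(gib)
        ```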


        • #5
          Dear Clyde Schechter, thank you very much for your input!

          I am using a laptop with 16 GB of RAM, and when I checked the Task Manager, it shows that Stata is only using about half of the available memory. I even tried regressing the dependent variable on a single independent variable - both of which have very large variation relative to their means - but Stata still does not complete the EM optimization, even after running for 10 hours. As this approach does not seem to be working in my case, could anyone please recommend an alternative methodology that could be used to estimate my model?

          Thank you,
          Nick
          Last edited by Nick Baradar; 02 Jul 2025, 09:48.


          • #6
            "I even tried regressing the dependent variable on a single independent variable - both of which have very large variation relative to their means - but Stata still does not complete the EM optimization, even after running for 10 hours."
            Well, just because the EM optimization is taking a long time, that doesn't mean that the calculations won't eventually reach a conclusion. Multi-level models with large numbers of random effects (and with crossed random effects, you get a combinatorial explosion in the total number of random effects) are very computationally intensive and can take a long time. With data sets around the size of the one you are working with, I have seen such models take between 2 and 3 weeks to reach conclusion on my setup, a medium-end Windows box. Given how long the EM has run, I think that's what you are in for. It may or may not be worth the wait: that's up to you.

            I have one thought that may improve things. Because you are using crossed effects, you are forcing Stata to allocate a random intercept for each pair of firm and event. Although the levels are not nested, perhaps each firm is involved in only a small number of events and each event involves only a small number of firms. In that case, you are forcing Stata to estimate a large number of random effects for pairings of firm and event that never occur. In other words, rather than a fully crossed model, perhaps what you really have is a sparse multiple membership model. If so, you would probably benefit from creating a new variable that indicates the pair of event and firm, i.e. -egen effect = group(firm_id eventdate)-, and using just that as your second model level. This will not enable you to estimate firm effects and event effects separately, so if that is your purpose, this approach is pointless. But if you don't need those things, then this model may run much more smoothly.
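
            How sparse the crossing is can be checked before fitting anything. This is a minimal sketch, not a definitive recipe: it uses a small toy data set so that it runs on its own, takes the variable names firm_id and eventdate from the first post, and introduces hypothetical helper names (tag_firm, tag_event, tag_pair, n_in_pair, and the local macros). On the real data, skip the toy-data lines.

            ```stata
            * Toy data for illustration only; skip these lines to use the real data
            clear
            set obs 6
            generate firm_id   = ceil(_n/2)      // firms 1,1,2,2,3,3
            generate eventdate = mod(_n, 4) + 1  // event dates 2,3,4,1,2,3

            * One identifier per observed firm-event pair
            egen long effect = group(firm_id eventdate)

            * Observed pairs versus the full firm-by-event grid
            egen byte tag_firm  = tag(firm_id)
            egen byte tag_event = tag(eventdate)
            egen byte tag_pair  = tag(effect)
            quietly count if tag_firm
            local n_firms = r(N)
            quietly count if tag_event
            local n_events = r(N)
            quietly count if tag_pair
            local n_pairs = r(N)
            local n_grid = `n_firms' * `n_events'
            display "observed pairs: `n_pairs' of `n_grid' possible"

            * Number of observations contributed by each pair
            bysort effect: generate long n_in_pair = _N
            summarize n_in_pair
            ```

            If the summary shows that every pair contributes exactly one observation, which can happen when each firm has a single CAR per event, a pair-level random intercept cannot be distinguished from the residual, so it is worth looking at n_in_pair before relying on this specification.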
            Last edited by Clyde Schechter; 02 Jul 2025, 10:53.


            • #7
              Thank you, Clyde Schechter! Am I correct that, after grouping by firm and event combination (effect), the command should be:
              Code:
              mixed CAR X1 X2 X3 || _all: R.effect


              • #8
                Well, you could do it that way, but more efficient would be:
                Code:
                mixed CAR X1 X2 X3 || effect:


                • #9
                  Thank you, Clyde Schechter, I highly appreciate your support!
