
  • mlogit not converging in fully interacted model

    Dear Statalists,
    I am running a multinomial logit (mlogit) where my y has 3 possible values. I would like to estimate a fully interacted model, i.e. one in which all my regressors are interacted with a dummy variable, D.
    I do the following:

    Code:
    mlogit y D
    mlogit y c.VAR1##D
    mlogit y c.VAR1##D i.VAR2##D
    mlogit y c.VAR1##D i.VAR2##D year##D
    At the 4th specification (when I add year##D), the iterations keep going indefinitely, each one reporting ‘(not concave)’. I left it running overnight and the iterations continued with more or less the same value of the log likelihood (so I do not think it is a matter of time).
    If I just insert year (not interacted with D), it converges in a few iterations and I get the results, so the problem is this interaction. I checked that there are no singletons (every possible year##D combination has many observations, at least 400).
    Any suggestion?
    PS: the dataset has around 35,000 obs, but I am planning to run this code on a 3,000,000 obs dataset. Also, I use Stata 15 but I could also use Stata 17, in case any of this is relevant.

  • #2
    One possibility is that year##D involves so many terms that it defines a model too complex for your data set to support. (This is not just a matter of sample size: if, e.g., very few observations have the combination year == 2020 and D == 1, what you want may not be estimable.) Does your year variable have a lot of values? You might inspect -tabulate year D- and see if there are any sparse cells. I have also occasionally seen weird problems with categorical response models involving a continuous year variable as a predictor, and was able to solve these by recoding the year variable to small integers (2000 becomes 0, 2001 becomes 1, and so on).
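
    A minimal sketch of both checks, assuming the variable names from the first post (year0 is a hypothetical name for the recoded variable):

    Code:
    * Look for sparse year-by-D cells
    tabulate year D

    * Shift year so that the earliest year in the data becomes 0
    summarize year, meanonly
    generate year0 = year - r(min)

    * Confirm the recoding kept the same cell counts
    tabulate year0 D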

    Without going into the details, I'd advise that whenever you want Stata to treat a variable as continuous, you use c.whatever, and whenever you want it treated as categorical, you use i.whatever. I'm not saying that's a source of problems here, but it's good practice for various reasons.
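
    For instance, the fourth specification from the first post could be written with every prefix spelled out. This is only a sketch, and it assumes VAR1 is continuous while VAR2, year and D are categorical. One reason the prefixes matter: without one, Stata treats a variable inside an interaction as categorical and a variable outside an interaction as continuous, so a bare year and year##D are not the same kind of term.

    Code:
    * c. marks continuous variables, i. marks categorical ones
    mlogit y c.VAR1##i.D i.VAR2##i.D i.year##i.D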


    • #3
      Thanks a lot Mike.

      I tried recoding my years to small numbers (1, 2, 3, etc.) but it does not work. From the first iteration it says "not concave". I do not think the dataset is sparse; below is the tabulation of year by D (before the recoding of year):
      Code:
      . tab year D
       
                 |           D
            Year |         0          1 |     Total
      -----------+----------------------+----------
            2008 |     1,012        406 |     1,418
            2009 |     1,281        744 |     2,025
            2010 |     1,594        727 |     2,321
            2013 |     2,977        775 |     3,752
            2014 |     3,519        855 |     4,374
            2015 |     4,945      1,183 |     6,128
            2016 |     5,968      1,399 |     7,367
            2017 |     6,330      1,431 |     7,761
      -----------+----------------------+----------
           Total |    27,626      7,520 |    35,146
      Things also do not change when I write i.year (but I agree with your suggestion of always using either c. or i.).

      Any other suggestions?
      Thanks a lot!


      • #4
        Claire:
        regardless of the nature of the problem (which is sometimes difficult to spot), the usual recipe is to start all over again, adding one predictor at a time, and see when Stata starts to choke.
        Kind regards,
        Carlo
        (Stata 19.0)
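
        A sketch of that incremental approach, using the variable names from the first post. The iterate(50) cap and the capture noisily prefix are assumptions added here, so that a fit that fails to converge stops after 50 iterations instead of running overnight and a do-file keeps going if Stata returns an error:

        Code:
        * Add one block of predictors at a time; e(converged) is 1 if the fit converged
        capture noisily mlogit y i.D, iterate(50)
        display e(converged)

        capture noisily mlogit y c.VAR1##i.D, iterate(50)
        display e(converged)

        capture noisily mlogit y c.VAR1##i.D i.VAR2##i.D, iterate(50)
        display e(converged)

        * year enters last, first on its own and then interacted with D
        capture noisily mlogit y c.VAR1##i.D i.VAR2##i.D i.year, iterate(50)
        display e(converged)

        capture noisily mlogit y c.VAR1##i.D i.VAR2##i.D i.year##i.D, iterate(50)
        display e(converged)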


        • #5
          Dear Carlo,

          thank you for your reply. Everything works up to and including the specification:

          Code:
          mlogit outcome D, b(0)

          mlogit outcome c.VAR1##D, b(0)

          mlogit outcome c.VAR1##D i.VAR2##D, b(0)
          If I add year as a predictor in the form of a series of dummies (i.year), or as an interacted series of dummies (i.year##D), Stata stops converging and tells me "not concave".

          But if I instead add year as continuous (c.year), or as an interacted continuous variable (c.year##D), Stata converges and gives me the results.

          I am not interested in the "year" coefficients (year, by the way, has a 2-year gap for all observations). Do you think I can just go with year as continuous?

          Like:
          Code:
          mlogit outcome c.VAR1##D i.VAR2##D c.year##D, b(0)
          Thanks a lot!


          • #6
            Claire:
            yes.
            Kind regards,
            Carlo
            (Stata 19.0)
