  • Why does probit work, but multinomial probit doesn't?

    In short, I estimate 4 probit regressions, examining the effect of a variable X on the probability of being employed (versus unemployed), for 4 groups of workers:

    1. total sample
    2. sector a
    3. sector b
    4. sector c.

    In each model the command is the same:

    dprobit employed X

    and all the estimations are successfully carried out.


    Then I try to incorporate all 4 regressions in 1, i.e. to estimate a multinomial probit model, instead.

    In this case I examine whether variable X affects mobility between these sectors a, b and c, using an indicator variable (id), taking the values 1, 2 and 3, respectively.

    The command I use is:

    mprobit id X

    However, in all my attempts the model cannot be estimated. Stata either reports "Hessian is not negative semidefinite" or the iterations (flagged "not concave") just continue forever.

    Can someone explain how it is that the probit models are estimated without problems while the multinomial probit model is not, on exactly the same data set?


    Also, any suggestions on how to solve this?
    Last edited by Pandelis Andreou; 15 Dec 2018, 09:56.

  • #2
    Multinomial probit and logit are for cases where each person chooses among several options, such as whether to take a green, blue, or red bus to work.

    From what I understand of your data, your subsamples do not overlap, so nobody faces all the options: one person can take either no bus or the red bus, another either no bus or the green bus, and so on.

    Maybe this is why you cannot estimate the multinomial probit: it is not appropriate for your data.



    • #3
      Well, just because one model works, there is no guarantee that some other vaguely related model will also converge. There's no connection between them. So let's just focus on why your -mprobit- model is not converging.

      My suggestion is to rerun it specifying the -iterate(#)- option, with # being a number of iterations large enough to get you to the point where Stata keeps spinning its wheels in the not concave zone of the likelihood. Then Stata will stop and report what its results to that point look like. Those results are not valid model estimates, but they often show where Stata is getting into trouble. You may find that one of your variables has coefficients or standard errors that are outlandishly large (positive or negative). That would be a strong clue to suggest that that particular variable is causing the problem, and you could then investigate both its overall distribution (is one level very rare?) and its crude association with the outcome (do a cross-tab and you may find one of the cells is zero or very close to zero.)
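The diagnostic steps described above can be sketched roughly as follows. This is only an illustration under assumptions: it uses the auto dataset that ships with Stata rather than the poster's data, with rep78 standing in for id and mpg and foreign standing in for X, and the cap of 20 iterations is arbitrary.

```stata
* Illustrative sketch only: auto.dta stands in for the poster's data.
sysuse auto, clear

* Cap the iterations so Stata stops and shows where it has got to.
* -capture noisily- keeps a do-file running even if the command exits
* with an error. Results from a non-converged run are not valid estimates.
capture noisily mprobit rep78 mpg i.foreign, iterate(20)

* Look for rare outcome levels ...
tabulate rep78

* ... and for empty or nearly empty cells against a suspect predictor.
tabulate rep78 foreign
```

In auto.dta the cross-tab does contain empty cells (no foreign car has a rep78 of 1 or 2), which is the kind of pattern worth looking for in the real data.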



      • #4
        Indeed, after limiting the iterations I saw that some variables were causing problems, and I removed them.

        Two more questions:

        a) How can I choose the base category? (The model selected id=2; I wanted id=0.)

        b) Is there a command, like dprobit in probit, that automatically estimates marginal effects for the mprobit estimations?



        • #5
          a) If you read -help mlogit- you will see that there is a -baseoutcome(#)- option that does this.

          b) Not that I'm aware of. In any case, there is no need for such a command (nor for -dprobit- itself) because you can get these from the -margins- command. Assuming you are unfamiliar with -margins-, the best introduction to it is the excellent Richard Williams' https://www3.nd.edu/~rwilliam/stats/Margins01.pdf. That will give you the general sense of how it works and some nicely worked examples. That particular file doesn't cover -margins- after -mlogit-, but really it works the same way after all estimation commands. For additional information, the PDF documentation that comes with your Stata installation has a very lengthy section about the command--it is very powerful and one of the best things in modern Stata. Do learn it.
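A minimal sketch of both answers, under stated assumptions: auto.dta again stands in for the poster's data, and the three-category variable id is a hypothetical recode of rep78 made up for this illustration. -mprobit- accepts the same -baseoutcome(#)- option as -mlogit-.

```stata
* Illustrative sketch only: auto.dta stands in for the poster's data.
sysuse auto, clear

* Hypothetical three-category outcome built from rep78 (stand-in for id).
recode rep78 (1/3 = 1) (4 = 2) (5 = 3), generate(id)

* a) Choose the base category explicitly.
mprobit id mpg i.foreign, baseoutcome(1)

* b) Average marginal effects of every covariate on Pr(id == 2).
margins, dydx(*) predict(outcome(2))

* Marginal effects evaluated at the covariate means,
* closer in spirit to what -dprobit- reported.
margins, dydx(*) predict(outcome(2)) atmeans
```

The same pattern applies after a plain -probit-: fit the model, then run -margins, dydx(*)-, adding -atmeans- if effects at the means are wanted rather than average marginal effects.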



          • #6
            Originally posted by Clyde Schechter
            ...
            b) Not that I'm aware of. In any case, there is no need for such a command (nor for -dprobit- itself) because you can get these from the -margins- command. Assuming you are unfamiliar with -margins-, the best introduction to it is the excellent Richard Williams' https://www3.nd.edu/~rwilliam/stats/Margins01.pdf. That will give you the general sense of how it works and some nicely worked examples. That particular file doesn't cover -margins- after -mlogit-, but really it works the same way after all estimation commands. For additional information, the PDF documentation that comes with your Stata installation has a very lengthy section about the command--it is very powerful and one of the best things in modern Stata. Do learn it.
            A minor point of information is that after mlogit, you can and probably should select which of the 3 outcomes you want the predicted probabilities for. For example, imagine that X is continuous, and Pandelis wants the predicted probability of being in sector A, varying the level of X from 0 to 20:

            Code:
            margins, predict(outcome(1)) at(X = (0(1)20))
            That assumes A is coded as 1 in the data. If you do not specify an outcome, I think Stata will default to the first outcome. Example 3 in the mlogit postestimation manual walks you through this.
            Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

            When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



            • #7
              In older versions of Stata, the default output for -margins- was, indeed, the first outcome category. But in version 15, the default is to give results for all outcomes. That said, if you are only interested in one of the outcomes, it makes sense to specify it so you don't have to sort through all the other output to find what you are looking for.
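The two forms can be compared side by side. This is a sketch under assumptions: auto.dta and the recoded variable id are hypothetical stand-ins for the poster's data, and the grid of mpg values is arbitrary.

```stata
* Illustrative sketch only: auto.dta stands in for the poster's data.
sysuse auto, clear
recode rep78 (1/3 = 1) (4 = 2) (5 = 3), generate(id)
mprobit id mpg

* No outcome specified: what is reported depends on the Stata version
* (every outcome in recent versions, only the first in older ones).
margins, at(mpg = (15(5)35))

* Restrict the output to the one outcome of interest.
margins, predict(outcome(1)) at(mpg = (15(5)35))
```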



              • #8
                Thanks Joro, Clyde and Weiwen for the informative replies. Fortunately, it will not be necessary to perform multinomial Probit in the current paper, since the 'plain' Probit analysis is adequate for the current data set.

