  • Mlogit/Mprobit/Logit

    Hi, I am writing my thesis, but I don't know very much about tests in Stata. I am looking for urgent assistance regarding model testing, test verification, and selection criteria for discrete choice models. I'm specifically having trouble determining the applicability of various tests for mlogit, mprobit, and logit models. If anyone is available to help or discuss this here or via DM, I would truly appreciate it. It is a bit of a time-sensitive situation.

  • #2
    mprobit, as it is implemented in Stata, has no advantages over mlogit, but tends to suffer from convergence problems more often. So in virtually all situations you can ignore mprobit. logit is for dichotomous dependent variables, while mlogit is for dependent variables with three or more categories. So the choice between those three models does not require testing.
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------



    • #3
      As an additional note, there are some more models, specifically for discrete choice, see: https://www.stata.com/manuals/cmintro5.pdf
      Regarding selection, what about AIC/BIC? Type
      Code:
      estat ic
      after the model estimation for the results.
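A minimal sketch of that workflow, assuming simulated data and hypothetical variable names (mismatch, female, age) that stand in for the real ones:

```stata
* Simulated data for illustration only; all variable names are hypothetical
clear
set seed 2024
set obs 1000
generate female = runiform() < 0.5
generate age    = 25 + floor(40 * runiform())
generate u      = runiform() + 0.1 * female
* Assumed coding: -1 = undereducated, 0 = match, 1 = overeducated
generate mismatch = cond(u < 0.3, -1, cond(u < 0.8, 0, 1))

* Fit the model, then ask for the information criteria
mlogit mismatch female age, baseoutcome(0)
estat ic
```

estat ic reports the log likelihood, degrees of freedom, AIC, and BIC for the most recently fitted model.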
      Best wishes

      Stata 18.0 MP | ORCID | Google Scholar



      • #4
        So I’m calculating the mismatch and have three categories (overeducation = 1, match = 0, undereducation = -1), so I’ll use mlogit, but apparently I need to test whether mprobit might be better by comparing AIC and BIC using estat ic.
        [Attached image: Mprobitlogit.png]

        Based on the estat ic results, I will choose the model with the lower AIC and BIC, namely mlogit. Within the mlogit framework, I am building three models in which I gradually add variables such as gender, age, household income, parents’ education, country, etc. What other tests should I perform to verify that the model is correct? As part of the robustness check, I ran a logit model for overeducation (0/1) with all variables and did the same for undereducation (0/1). Do I also need to verify whether to use logit or probit?
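The stepwise build described here could be compared along the following lines. This is only a sketch on simulated data; mismatch, female, age, and lninc (log household income) are hypothetical names, and the three specifications are illustrative rather than the poster's actual models:

```stata
* Simulated data for illustration only; all variable names are hypothetical
clear
set seed 2024
set obs 1000
generate female = runiform() < 0.5
generate age    = 25 + floor(40 * runiform())
generate lninc  = rnormal(10, 0.5)
generate u      = runiform() + 0.1 * female
* Assumed coding: -1 = undereducated, 0 = match, 1 = overeducated
generate mismatch = cond(u < 0.3, -1, cond(u < 0.8, 0, 1))

* Three specifications with gradually added covariates
mlogit mismatch female, baseoutcome(0)
estimates store m1
mlogit mismatch female age, baseoutcome(0)
estimates store m2
mlogit mismatch female age lninc, baseoutcome(0)
estimates store m3

* One table with log likelihood, df, AIC, and BIC for all stored models
estimates stats m1 m2 m3
```

AIC and BIC are only comparable across models fitted to the same estimation sample, so it is worth checking that N is identical in all three rows, for example when added covariates have missing values.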



        • #5
          I am not aware of many other tests you can apply here. You can look at Pseudo R² to see whether variance is explained, in general. Also, you need theoretical arguments that outline why certain variables are useful to include.
          Best wishes

          Stata 18.0 MP | ORCID | Google Scholar



          • #6
            The inclusion of these variables is based on a review of the empirical literature, which examined the same or similar variables as determinants of educational mismatch.



            • #7
              Originally posted by Damian Chroboczek:
              I’ll use mlogit, but apparently I need to test whether mprobit might be better by comparing AIC and BIC using estat ic.
              That is incorrect. The difference between mlogit and mprobit is so small (in the way Stata implemented mprobit) that the difference is a non-issue: you would need huge sample sizes to be able to reliably detect the difference, and if you do, the difference is going to be so small that it is substantively completely meaningless. So testing in this scenario makes absolutely no sense whatsoever. If I see someone testing these differences, then that proves one thing, and one thing only: the authors do not understand the models they are estimating. Such a result does not increase my confidence in that article.

              Originally posted by Damian Chroboczek:
              As part of the robustness check, I ran a logit model for overeducation (0/1) with all variables and did the same for undereducation (0/1). Do I also need to verify whether to use logit or probit?
              To make the results of those models comparable, you need to create your variables a bit differently. I will assume that your mlogit model uses "match" as the base outcome, i.e. you added the option baseoutcome(0) to your mlogit model. In that case you create two variables, let's call them over and under. The variable over will contain 1 if overeducated, 0 for match, and missing (.) if undereducated. Similarly, the variable under will contain 1 if undereducated, 0 if match, and missing if overeducated. Your two logit models will now give results very similar to those of your mlogit model. However, since we know that the models give very similar results even before we estimate them, they are not really useful as a robustness check.
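The variable construction described above can be sketched as follows. The data are simulated and the names mismatch, female, and age are hypothetical; only the coding (1 = overeducated, 0 = match, -1 = undereducated) is taken from the thread:

```stata
* Simulated data for illustration only; all variable names are hypothetical
clear
set seed 2024
set obs 1000
generate female = runiform() < 0.5
generate age    = 25 + floor(40 * runiform())
generate u      = runiform() + 0.1 * female
generate mismatch = cond(u < 0.3, -1, cond(u < 0.8, 0, 1))

* over:  1 if overeducated,  0 if match, missing if undereducated
generate over  = mismatch == 1  if inlist(mismatch, 0, 1)
* under: 1 if undereducated, 0 if match, missing if overeducated
generate under = mismatch == -1 if inlist(mismatch, 0, -1)

* Multinomial model with "match" as the base outcome
mlogit mismatch female age, baseoutcome(0)

* Binary models, each contrasting one category against "match" only
logit over  female age
logit under female age
```

Because each logit drops the observations in the third category, its coefficients refer to the same contrast as the corresponding mlogit equation, which is why the estimates should come out close to one another.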
              ---------------------------------
              Maarten L. Buis
              University of Konstanz
              Department of history and sociology
              box 40
              78457 Konstanz
              Germany
              http://www.maartenbuis.nl
              ---------------------------------



              • #8
                When it comes to logit and probit, most people just use whatever is commonly used by others in their field.

                I almost always use logit, but this handout discusses why you might sometimes use something else.

                https://academicweb.nd.edu/~rwilliam/stats3/L09.pdf

                An excerpt:

                Long (1997, p. 83) says that the choice between the logit and probit models is largely one of convenience and convention, since the substantive results are generally indistinguishable. But, in some cases, the need to generalize a model may be an issue. For example, multiple-equation systems involving qualitative dependent variables are based on the probit model (e.g. see biprobit). For models with nominal dependent variables that have more than 2 categories, the logit model (estimated by mlogit) may be preferred because the corresponding probit model (estimated by mprobit) is too computationally demanding. For panel data, you can estimate a fixed effects model with logit but not with probit.
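One way to see the "generally indistinguishable" point is to fit both models to the same data and compare the predicted probabilities. The sketch below uses simulated data; the variable names and the data-generating values are assumptions chosen purely for illustration:

```stata
* Simulated data for illustration only; all variable names are hypothetical
clear
set seed 2024
set obs 1000
generate female = runiform() < 0.5
generate age    = 25 + floor(40 * runiform())
generate over   = runiform() < invlogit(-1 + 0.5 * female + 0.02 * (age - 45))

* Same outcome and covariates, two link functions
logit over female age
predict p_logit, pr
probit over female age
predict p_probit, pr

* Compare the fitted probabilities from the two models
summarize p_logit p_probit
correlate p_logit p_probit
```

The coefficients themselves are on different scales in the two models, so the comparison is made on predicted probabilities rather than on the raw estimates.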
                -------------------------------------------
                Richard Williams
                Professor Emeritus of Sociology
                University of Notre Dame
                StataNow Version: 19.5 MP (2 processor)

                WWW: https://academicweb.nd.edu/~rwilliam/
