  • [Choice modeling] Overspecifying a mixed logit model

    Dear Stata list,

    I'm working on a discrete choice experiment (DCE) and am interested in a recent paper by Crastes dit Sourd et al. (2021), which presents different ways of specifying categorical variables in mixed logit models. Following the work of Walker (2001, 2002), the authors suggest a two-step approach.

    First, they suggest estimating an over-specified model in which all levels (modalities) of each categorical attribute are included. Second, they suggest using, for each attribute, the level with the lowest estimated standard deviation as the reference level:

    Citing the authors: "The author reports that the correct bases can be found by estimating a preliminary, overspecified model in which means and standard deviations are estimated for all the categorical attribute levels, rather than all but one. For each categorical attribute, the level for which the estimated standard deviation is lowest must be chosen as the base in the next iterations of the modelling work."

    If I consider 2 alternatives and 3 attributes with 3 levels each, I have to specify (using the mixlogit command developed by Hole):

    mixlogit choice, rand(asc1 asc2 x1a1 x1a2 x1a3 x2a1 x2a2 x2a3 x3a1 x3a2 x3a3) group(case) id(id)

    Obviously, there is multicollinearity, and the command returns: "Some variables are collinear - check your model specification"
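
    To make the source of the collinearity concrete, here is a minimal sketch on made-up data (not my real design). The variable names mirror the command above, and I am assuming that x1a1-x1a3 are the three level dummies of attribute 1, and so on. Within each attribute the level dummies sum to 1, as do asc1 + asc2, so the columns are exactly linearly dependent. Stata's built-in _rmcoll is used here only to flag this; I do not know whether mixlogit performs its check in the same way.

```stata
* Toy design matrix (hypothetical): full factorial of 2 alternatives
* and 3 attributes with 3 levels each = 54 rows
clear
set obs 54
gen byte x1  = mod(_n - 1, 3) + 1
gen byte x2  = mod(floor((_n - 1) / 3), 3) + 1
gen byte x3  = mod(floor((_n - 1) / 9), 3) + 1
gen byte alt = mod(floor((_n - 1) / 27), 2) + 1

* Alternative-specific constants, one per alternative
gen byte asc1 = alt == 1
gen byte asc2 = alt == 2

* One dummy per level of each attribute (no base level left out)
forvalues j = 1/3 {
    forvalues l = 1/3 {
        gen byte x`j'a`l' = x`j' == `l'
    }
}

* Each block of dummies sums to 1 in every row, hence the dependence
assert asc1 + asc2 == 1
assert x1a1 + x1a2 + x1a3 == 1
assert x2a1 + x2a2 + x2a3 == 1
assert x3a1 + x3a2 + x3a3 == 1

* _rmcoll reports which variables it omits because of collinearity
_rmcoll asc1 asc2 x1a1 x1a2 x1a3 x2a1 x2a2 x2a3 x3a1 x3a2 x3a3, noconstant
display "`r(varlist)'"
```

    So the collinearity in the over-specified model is there by construction, which is why I wonder how the preliminary model can be estimated at all.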

    Am I missing something? Do I have to "force" the command to run? Could anyone help me? The authors did their modelling in R and published their code in an appendix.

    Many thanks to all!

    Reference: https://papers.ssrn.com/sol3/papers....act_id=5324390
    Last edited by Gabin Morillon; 09 Mar 2026, 10:07.