  • A problem with interaction in mlogit and marginal effects at means

    Hey all, thanks beforehand.

    I'm computing marginal effects at the means (MEM) and getting a strange result, so I think I'm doing something wrong. I'm using Stata 14.2 on Windows 10.

    My dataset has 32 variables and roughly 800,000 observations; three of the variables are used in the question below. An example:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float action_num byte(officer_race_num driver_race_num)
    2 1 1
    2 0 0
    2 0 0
    2 0 0
    1 0 1
    2 0 0
    2 0 0
    1 2 2
    2 1 1
    1 0 1
    2 0 0
    3 1 1
    1 0 0
    1 0 2
    1 0 1
    1 1 1
    2 0 0
    1 0 0
    2 0 0
    1 0 1
    end
    With this, I'm fitting an mlogit model where the dependent variable is action_num (categorical, 3 categories) and the independent variables are officer_race_num and driver_race_num (both categorical, with 3 and 4 categories respectively) and their interaction, like so:

    Code:
     mlogit action_num i.officer_race_num i.driver_race_num i.officer_race_num#i.driver_race_num i.year, r base(1)
    Finally, I run margins to get the MEM from the regression above, like so:

    Code:
     margins, dydx (i.officer_race_num) atmeans post
    The strange result is this: if I create the interaction variable manually, run mlogit with the manual variable instead of Stata's interaction operator ("#"), and then post the MEM, I get a different result than the MEM estimated with the "#" interaction.

    That is, I replaced the mlogit command above with:

    Code:
    gen off_drv_race = 0
    replace off_drv_race = 1 if officer_race_num==1 & driver_race_num==1
    replace off_drv_race = 2 if officer_race_num==1 & driver_race_num==2
    replace off_drv_race = 3 if officer_race_num==1 & driver_race_num==3
    replace off_drv_race = 4 if officer_race_num==2 & driver_race_num==1
    replace off_drv_race = 5 if officer_race_num==2 & driver_race_num==2
    replace off_drv_race = 6 if officer_race_num==2 & driver_race_num==3
    mlogit action_num i.officer_race_num i.driver_race_num i.off_drv_race i.year, r base(1)
    This produces exactly the same mlogit estimates as the first model, but when I run the same margins command, I get different marginal effects.

    The problem: I want to post the margins for the interactions, and to do this I need to use the manual interaction, since margins won't accept the "#" interaction in dydx(). But I see that this changes the initial results, so I think I may be making a mistake along the way.

    I checked whether the mlogit results differ, but they are the same with the manual interaction and with the "#" interaction.

    So my main question would be, why is there a difference? And what does this difference mean? What am I doing wrong?

    Thanks a lot,
    Dor.



  • #2
    Originally posted by Dor Leventer
    So my main question would be, why is there a difference? And what does this difference mean? What am I doing wrong?
    The short answer is: the difference means that one set of results is correct and the other is not; not using factor variable notation but creating the interaction term "by hand" is a mistake.

    What happens is that, in order to get results (predictions) right, margins needs to know which variables belong together. You tell margins which variables belong together by using factor variable notation. When you type

    Code:
    ... i.officer_race_num#i.driver_race_num
    you tell margins that (and how) this term is made up of two variables; if you do not, margins treats the interaction term (off_drv_race) as an additional variable.

    I do not completely understand what you are ultimately trying to do, but note that margins is perfectly happy with statements such as

    Code:
    margins officer_race_num#driver_race_num
    Best
    Daniel
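A minimal sketch of what the factor-variable route could look like for the model in this thread. It assumes the poster's full dataset is in memory (including `year`, which is not part of the dataex excerpt), and the choice of outcome 2 is arbitrary and only for illustration:

```stata
* Fit the model once, with the interaction in factor-variable notation.
* (## expands to both main effects plus their interaction.)
mlogit action_num i.officer_race_num##i.driver_race_num i.year, vce(robust) baseoutcome(1)

* Predicted probability of outcome 2 for every officer-by-driver race cell,
* with the remaining covariates held at their means.
margins officer_race_num#driver_race_num, atmeans predict(outcome(2))

* Marginal effect of officer race on Pr(outcome 2), reported separately
* at each level of driver race.
margins driver_race_num, dydx(officer_race_num) atmeans predict(outcome(2))
```

Because the interaction is declared with `##`, margins can recompute it whenever it changes either race variable, so no hand-made interaction variable is needed.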



    • #3
      Hey Daniel, thanks for helping.

      When I put the factor-variable interaction (not the hand-made one) into the margins command, I get an error:

      Code:
      margins, dydx (i.officer_race_num##i.driver_race_num) atmeans post
      invalid dydx() option;
      levels of interactions not allowed
      By the way, I don't understand why treating the interaction as an additional variable is wrong, since econometrically it is an interaction variable and thus, by definition, separate from the other categorical variables.



      • #4
        I am away from the PC today. Read Vince Wiggins' post for an explanation of why there might not be an average marginal effect of an interaction term; it also explains that an interaction is not an additional, separate variable of its own.

        Best
        Daniel
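A small sketch of why a hand-made interaction misleads margins, assuming the dataex example from the first post is in memory. The variable names `off1`, `drv1`, and `off1_drv1` are hypothetical and introduced only for this illustration:

```stata
* Two 0/1 indicators and their hand-made product.
generate byte off1 = officer_race_num == 1
generate byte drv1 = driver_race_num == 1
generate byte off1_drv1 = off1 * drv1

* Compare the sample means: the mean of the product is in general
* not the product of the two means.
summarize off1 drv1 off1_drv1
```

To margins, `off1_drv1` is just another covariate. When it varies officer race under `atmeans`, it leaves `off1_drv1` at its own sample mean, a value that need not match any product of the two race indicators. With `#` notation, margins instead recomputes the product each time it changes one of the component variables.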




        • #5
          I haven't looked carefully at your specific problem but this handout explains why you should NOT compute interaction terms yourself if you want to use margins.

          https://www3.nd.edu/~rwilliam/xsoc73994/Margins01.pdf
          -------------------------------------------
          Richard Williams, Notre Dame Dept of Sociology
          Stata Version: 17.0 MP (2 processor)

          WWW: https://www3.nd.edu/~rwilliam



          • #6
            Thanks a lot everyone, problem solved.
