  • How to plot an interaction term in a multinomial logistic regression model?

    Dear statalists,

    Hope this post finds you well.


    I have tried to plot an interaction between a continuous variable and a categorical variable in a multinomial logistic regression. Despite following the steps suggested on the UCLA Stata website, I have not succeeded. I have tried commands such as margins and marginsplot, but the resulting plot looks odd. Any suggestions?

    I am currently looking at the association between plasma calcium level during trimester 1 and the incidence of hypertensive disorders of pregnancy (HDP) in women, particularly after 20 weeks of pregnancy. In our study, the dependent variable has three categories:

    1_Non-hypertensive (reference)
    2_Pre-eclampsia
    3_Pregnancy Induced Hypertension.

    We found an interaction between plasma calcium and ethnicity.

    Ethnicity has been categorised into 3 groups: ethnic group_1 (reference), ethnic group_2 and ethnic group_3.

    According to our findings, a one-unit increase in plasma calcium is associated with a 5% lower risk of pre-eclampsia, specifically in ethnic group 1. This significant association between plasma Ca and a lower risk of HDP did not hold when I examined our participants as an entire cohort.

    The problem I am encountering now is that I have been searching for syntax to display these findings in graphs, but the graphs end up looking odd.

    To make it simpler, I started with univariable regression,

    My Stata inputs are:

    . mlogit HDP Plasma_Ca i.mo_eth, base(1)

    . margins mo_eth, atmeans predict(outcome(1))
    . margins mo_eth, atmeans predict(outcome(2))
    . margins mo_eth, atmeans predict(outcome(3))

    . margins, at(Plasma_Ca=(50(20)170)) predict(outcome(1))
    . margins, at(Plasma_Ca=(50(20)170)) predict(outcome(2))
    . margins, at(Plasma_Ca=(50(20)170)) predict(outcome(3))

    . predict p1 p2 p3

    . sort Plasma_Ca

    . twoway (line p1 Plasma_Se if mo_eth ==1) (line p1 Plasma_Se if mo_eth==2) (line p1 Plasma_Se if mo_eth ==3),legend(order(1 "mo_eth = 1" 2 "mo_eth = 2" 3 "mo_eth = 3") ring(0) position(7) row(1))

    . twoway (line p2 Plasma_Se if mo_eth ==1) (line p2 Plasma_Se if mo_eth==2) (line p2 Plasma_Se if mo_eth ==3),legend(order(1 "mo_eth = 1" 2 "mo_eth = 2" 3 "mo_eth = 3") ring(0) position(7) row(1))

    . twoway (line p3 Plasma_Se if mo_eth ==1) (line p3 Plasma_Se if mo_eth==2) (line p3 Plasma_Se if mo_eth ==3),legend(order(1 "mo_eth = 1" 2 "mo_eth = 2" 3 "mo_eth = 3") ring(0) position(7) row(1))

    I am not sure whether the commands above are the right ones to use. I also wonder whether I can still apply these graph-plotting commands to a multivariable multinomial logistic regression model in which I adjust for other factors related to my outcome (HDP) and plasma Ca.

    Any comments would be much appreciated.

    Many thanks,
    Emerald

  • #2
    Code:
    // open example data
    sysuse nlsw88, clear
    
    // prepare the data
    gen byte marst = !never_married + married if !missing(never_married, married)
    label variable marst "marital status"
    label define marst 0 "never married"    ///
                       1 "widowed/divorced" ///
                       2 "married"
    label value marst marst
    
    //estimate model
    mlogit marst i.race##c.grade i.south
    
    //====================================================== prepare data for graph
    // tells Stata to return to this state of the data when typing restore
    preserve  
    
    // fix any control variables (we won't keep this as we typed preserve)
    replace south = 0
    
    // predict the probabilities (while keeping control variables fixed)
    predict pr*, pr  
    
    // keep only the variables you want to plot (we won't keep these changes)
    keep pr* race grade  
    
    // create a variable that uniquely identifies observations (helps with reshape)
    gen id = _n  
    
    // stack the predicted outcomes underneath one another so we can use a by graph
    reshape long pr , i(id) j(outcome)
    
    //label the outcome
    label define outcome 1 "never married"    ///
                         2 "widowed/divorced" ///
                         3 "married"
    label value outcome outcome
    
    // create separate versions of pr for the different races, so they can be different lines
    separate pr, by(race) veryshortlabel
    
    //make the graph
    twoway line pr? grade,                          ///
        by(outcome, legend(at(4) pos(0)) note("") ) ///
        sort ytitle("probability")
        
    // get back to the state of the data when we typed preserve    
    restore
    For fixing additional variables see http://www.maartenbuis.nl/wp/inter_q...ter_quadr.html

    For the veryshortlabel option in separate see: Cox, N. J. 2005. Stata tip 27: Classifying data points on scatter plots. Stata Journal 5: 604–606. https://doi.org/10.1177/1536867X0500500412
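
    The margins and marginsplot commands tried in #1 can also produce this kind of graph once the interaction is in the model. What follows is only a minimal sketch on the same example data, not a tested recipe for the HDP data: the grid 8(2)18 for grade and the choice of outcome(2) (the value of marst for "married") are assumptions made for illustration.

    ```stata
    // same example data and outcome variable as in the code above
    sysuse nlsw88, clear
    gen byte marst = !never_married + married if !missing(never_married, married)

    // the ## operator adds both main effects and the interaction
    mlogit marst i.race##c.grade i.south

    // predicted probability of marst == 2 (married) for each race over a grid
    // of grade values, averaged over the observed values of south
    margins race, at(grade=(8(2)18)) predict(outcome(2))

    // one line per race, with confidence intervals
    marginsplot, ytitle("Pr(married)")
    ```

    Repeating the margins and marginsplot pair with outcome(0) and outcome(1) would give one graph per outcome. Further covariates added to the mlogit command are averaged over by margins unless they are fixed with at().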
    Last edited by Maarten Buis; 21 Jan 2019, 01:55.
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------



    • #3
      Dear Maarten,

      Many thanks for the input above. May I double-check what "pr*" and "pr?" mean in this context? I have attached my plot to this post and wonder whether further rescaling is advisable, or whether I should leave it as it is. Thank you.


      Graph_interaction plot.gif

      Emerald



      • #4
        predict pr*, pr tells Stata to compute the predicted probabilities for all three outcomes and store them in pr1, pr2, and pr3. If we had four outcomes, it would compute the probabilities for all four outcomes and put them in pr1, pr2, pr3, and pr4.

        pr? is shorthand for all variables whose names consist of pr followed by exactly one other character. I used separate before to create pr1, pr2, and pr3 for the different races, so those three variables are what will be captured by that shorthand.
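
        A quick way to see what the stub does is to run it on the example data from #2 and check the result. This is a small sketch; the variable total is a name made up for the check and is not part of the original code.

        ```stata
        sysuse nlsw88, clear
        gen byte marst = !never_married + married if !missing(never_married, married)
        mlogit marst i.race##c.grade i.south

        // one new variable per outcome: pr1, pr2, pr3
        predict pr*, pr
        describe pr1 pr2 pr3

        // the three probabilities should add up to 1 (up to float precision)
        gen double total = pr1 + pr2 + pr3
        summarize total
        ```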
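
        The difference between the two wildcards can be checked on a toy dataset. The variable names below (prob, pr10) and the local macros short and all are invented purely for this illustration.

        ```stata
        clear
        set obs 1
        gen pr1  = .
        gen pr2  = .
        gen pr3  = .
        gen prob = .
        gen pr10 = .

        // ? stands for exactly one character, so prob and pr10 are not matched
        unab short : pr?
        display "`short'"

        // * stands for any number of characters, so every variable here matches
        unab all : pr*
        display "`all'"
        ```

        This is why in #2 it matters that only race, grade, and the probabilities are kept before separate is used: any other variable named pr plus one character would also be drawn by twoway line pr? grade.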

        That graph looks to me like a good candidate for a logit scale. See: Nicholas J. Cox, 2008. "Stata tip 59: Plotting on any transformed scale," Stata Journal, 8(1):142-145. https://doi.org/10.1177/1536867X0800800113
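
        One way such a logit scale could be set up, in the spirit of that tip, is to plot the logit of the probability and label the axis with the original probabilities. The sketch below uses the example data from #2 and only the third outcome; the label values 0.2 to 0.9 and the names lpr3 and ylab are assumptions chosen for illustration and would need adapting to the range of your own predictions.

        ```stata
        sysuse nlsw88, clear
        gen byte marst = !never_married + married if !missing(never_married, married)
        mlogit marst i.race##c.grade i.south

        // fix the control variable (this overwrites south in memory)
        replace south = 0
        predict pr*, pr

        // transform the probability of the third outcome to the logit scale
        gen double lpr3 = logit(pr3)

        // axis labels: position on the logit scale, text on the probability scale
        local ylab
        foreach p in 0.2 0.4 0.6 0.8 0.9 {
            local pos = logit(`p')
            local ylab `ylab' `pos' "`p'"
        }

        // one variable per race (lpr31, lpr32, lpr33), so each gets its own line
        separate lpr3, by(race) veryshortlabel
        twoway line lpr31 lpr32 lpr33 grade, sort ///
            ylabel(`ylab', angle(horizontal))     ///
            ytitle("Pr(married), logit scale")
        ```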
