How to plot an interaction term in a multinomial logistic regression model?

Emerald Chang

Join Date: Sep 2017

Posts: 50
#1

21 Jan 2019, 01:02

Dear statalists,

Hope this post finds you well.

I have tried to plot an interaction between a continuous variable and a categorical variable in a multinomial logistic regression. Despite following the steps suggested on the UCLA Stata website, I have not managed to do so. I have been trying commands such as margins and marginsplot, but the resulting plot looks odd. Any suggestions on this?

I am currently looking at the association between the level of plasma calcium during trimester 1 and the incidence of hypertensive disorders of pregnancy (HDP) in women, in particular after 20 weeks of pregnancy. In our study, the dependent variable has been divided into:

1_Non-hypertensive (reference)
2_Pre-eclampsia
3_Pregnancy Induced Hypertension.

We found an interaction between plasma calcium and ethnicity.

Ethnicity has been categorised into 3 groups: ethnic group 1 (reference), ethnic group 2 and ethnic group 3.

According to our findings, a one-unit increase in plasma calcium is associated with a 5% lower risk of pre-eclampsia, specifically in ethnic group 1. This significant association between plasma Ca and a lower risk of HDP did not hold when I examined our participants as an entire cohort.

The problem I am encountering now is that I have been searching for syntax to display the findings above in graphs, but the graphs ended up looking odd.

To keep it simple, I started with a univariable regression.

My Stata inputs are:

. mlogit HDP Plasma_Ca i.mo_eth, base(1)

. margins mo_eth, atmeans predict (outcome (1))
. margins mo_eth, atmeans predict (outcome (2))
. margins mo_eth, atmeans predict (outcome (3))

. margins, at (Plasma_Ca = (50 (20) 170)) predict(outcome(1))
. margins, at (Plasma_Ca = (50 (20) 170)) predict(outcome(2))
. margins, at (Plasma_Ca = (50 (20) 170)) predict(outcome(3))

. predict p1 p2 p3

. sort Plasma_Ca

. twoway (line p1 Plasma_Se if mo_eth ==1) (line p1 Plasma_Se if mo_eth==2) (line p1 Plasma_Se if mo_eth ==3),legend(order(1 "mo_eth = 1" 2 "mo_eth = 2" 3 "mo_eth = 3") ring(0) position(7) row(1))

. twoway (line p2 Plasma_Se if mo_eth ==1) (line p2 Plasma_Se if mo_eth==2) (line p2 Plasma_Se if mo_eth ==3),legend(order(1 "mo_eth = 1" 2 "mo_eth = 2" 3 "mo_eth = 3") ring(0) position(7) row(1))

. twoway (line p3 Plasma_Se if mo_eth ==1) (line p3 Plasma_Se if mo_eth==2) (line p3 Plasma_Se if mo_eth ==3),legend(order(1 "mo_eth = 1" 2 "mo_eth = 2" 3 "mo_eth = 3") ring(0) position(7) row(1))

I am not sure whether the commands above are the right ones to use. I also wonder whether I can still apply these graph-plotting commands to a multivariable multinomial logistic regression model, in which I intend to adjust for other factors that are related to my outcome (HDP) and plasma Ca.

Any comments would be much appreciated.

Many thanks,
Emerald

Maarten Buis

Join Date: Mar 2014
Posts: 3467
#2

21 Jan 2019, 01:45

Code:

// open example data
sysuse nlsw88, clear

// prepare the data
gen byte marst = !never_married + married if !missing(never_married, married)
label variable marst "marital status"
label define marst 0 "never married"    ///
                   1 "widowed/divorced" ///
                   2 "married"
label value marst marst

//estimate model
mlogit marst i.race##c.grade i.south

//====================================================== prepare data for graph
// tells Stata to return to this state of the data when typing restore
preserve  

// fix any control variables (we won't keep this as we typed preserve)
replace south = 0

// predict the probabilities (while keeping control variables fixed)
predict pr*, pr  

// keep only the variables you want to plot (we won't keep these changes)
keep pr* race grade  

// create a variable that uniquely identifies observations (helps with reshape)
gen id = _n  

// stack the predicted outcomes underneath one another so we can use a by graph
reshape long pr , i(id) j(outcome)

//label the outcome
label define outcome 1 "never married"    ///
                     2 "widowed/divorced" ///
                     3 "married"
label value outcome outcome

// create separate versions of pr for the different races, so they can be different lines
separate pr, by(race) veryshortlabel

//make the graph
twoway line pr? grade,                          ///
    by(outcome, legend(at(4) pos(0)) note("") ) ///
    sort ytitle("probability")
    
// get back to the state of the data when we typed preserve    
restore

For fixing additional variables see http://www.maartenbuis.nl/wp/inter_q...ter_quadr.html

For the veryshortlabel option in separate see: Cox, N. J. 2005. Stata tip 27: Classifying data points on scatter plots. Stata Journal 5: 604–606. https://doi.org/10.1177/1536867X0500500412
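
Since the original question mentions margins and marginsplot, the same kind of graph can also be approached that way. The following is only a minimal sketch on the example data used above, not a replacement for the code in this post. The grid of grade values, 8(2)18, is an arbitrary choice made for illustration, and only the outcome "married" (value 2 of marst) is shown; the other outcomes would need their own margins and marginsplot calls.

```stata
// open example data and rebuild the outcome variable from the example above
sysuse nlsw88, clear
gen byte marst = !never_married + married if !missing(never_married, married)

// same model as above: race interacted with grade, south as a control
mlogit marst i.race##c.grade i.south

// predicted probability of marst == 2 (married) for each race over a grid
// of grade, holding the control variable south fixed at 0
// (the grid 8(2)18 is an assumption made for this illustration)
margins race, at(grade=(8(2)18) south=0) predict(outcome(2))

// one line per race; noci suppresses the confidence intervals
marginsplot, noci ytitle("Pr(married)")
```

In a multivariable model, further control variables can be fixed in the same at() option, in the same spirit as the replace south = 0 step above.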

Last edited by Maarten Buis; 21 Jan 2019, 01:55.

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------


Emerald Chang

Join Date: Sep 2017

Posts: 50
#3

21 Jan 2019, 03:43

Dear Maarten,

Many thanks for the input above. May I double-check with you what "pr*" and "pr?" mean in this context specifically? I have attached my plot to this post, and I wonder whether further rescaling is advisable or whether I should leave it as it is. Thank you.

Emerald
Maarten Buis

Join Date: Mar 2014

Posts: 3467
#4

21 Jan 2019, 04:10

predict pr*, pr means predict the predicted probabilities for all three outcomes and store them in pr1, pr2, and pr3. If we had four outcomes, it would predict the probabilities for all four outcomes and put them in pr1, pr2, pr3, and pr4.

pr? is shorthand for all variables whose names start with pr followed by exactly one other character. I used separate before to create pr1, pr2, and pr3 for the different races, so those three variables are what will be captured by that shorthand.
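
As a small illustration of the two wildcards in a variable list, here is a sketch on a toy dataset. The variables pr1, pr2, and pr10 are hypothetical and exist only to show which names each pattern picks up.

```stata
// toy data, created only to illustrate the wildcards
clear
set obs 1
gen pr1  = 1
gen pr2  = 2
gen pr10 = 3

// * matches zero or more characters: lists pr1 pr2 pr10
ds pr*

// ? matches exactly one character: lists pr1 pr2, but not pr10
ds pr?
```

Note that in predict pr*, pr the star is used a little differently: there it is a stub for new variables to be created, whereas in ds, keep, or twoway line it selects variables that already exist.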

That graph looks to me like a good candidate for a logit scale. See: Nicholas J. Cox, 2008. "Stata tip 59: Plotting on any transformed scale," Stata Journal, 8(1):142-145. https://doi.org/10.1177/1536867X0800800113
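
A minimal sketch of what plotting on a logit scale could look like for the example from my earlier post follows. It plots logit(pr) but labels the axis with the corresponding probabilities. The variable name lpr and the probabilities chosen for the axis labels are assumptions made for this illustration; suitable labels depend on the range of your own predictions.

```stata
// rebuild the example from the earlier post
sysuse nlsw88, clear
gen byte marst = !never_married + married if !missing(never_married, married)
mlogit marst i.race##c.grade i.south

preserve
replace south = 0                    // fix the control variable
predict pr*, pr                      // pr1 pr2 pr3: one per outcome
keep pr* race grade
gen id = _n
reshape long pr, i(id) j(outcome)

// transform the predicted probabilities to the logit scale
// (lpr is a name chosen for this sketch)
gen lpr = logit(pr)
separate lpr, by(race) veryshortlabel

// axis labels: positions on the logit scale, text on the probability scale
// (these probabilities are an arbitrary illustrative choice)
local labels
foreach p in .01 .05 .25 .5 .75 .95 {
    local labels `labels' `=logit(`p')' "`p'"
}

twoway line lpr? grade,                         ///
    by(outcome, legend(at(4) pos(0)) note(""))  ///
    sort ylabel(`labels', angle(horizontal))    ///
    ytitle("probability (logit scale)")

restore
```

The outcome value labels are left out here to keep the sketch short; they can be added as in the earlier code.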
