Predicted probabilities based on simulated success rates of categorical predictor for logit models

Daniel Byrd

Join Date: Oct 2018

Posts: 4
#1

31 Oct 2018, 05:28

Hi all,

I ran a basic logit model, and the standardized coefficients showed that a dichotomous variable is the strongest predictor. Next, I ran descriptives and found that only 40% of people answered yes to that item. I want to see what would happen to the success rate of my DV if the success rate of the predictor increased from 40% to 60%. I know how to predict success at different values of an IV when the variable is continuous, but I don't know how to do it when I'm simulating a different success rate for a categorical predictor. Any thoughts?

Thanks again

Dan
Matt Warkentin

Join Date: May 2016

Posts: 104
#2

31 Oct 2018, 09:11

Hi Daniel,

I am a little confused by your question. With a continuous predictor, you can make predictions for the outcome probability based on any plausible value of the predictor. However, for a binary variable, individuals can only be 0 or 1. There is no such thing as 40% or 60% of a binary predictor (centering variables aside). A predictor success rate of 40% or 60% is a sample metric, not an individual metric. So what you are asking, I think, is how many more events you might have if the sample had predictor==1 for 60% of participants instead of 40%. I have tried to achieve this below. Hope this is a helpful starting point.
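
In formula terms, this sample-level question can be sketched as a weighted average of the two group-specific probabilities. The symbols here are notation introduced for illustration: $\pi$ is the share of the sample with predictor==1, and $p_0$, $p_1$ are the outcome probabilities when the predictor is 0 or 1. The sketch assumes a logit model with this single binary predictor and coefficients that stay the same when $\pi$ changes.

\[
P(y = 1) = \pi\,p_1 + (1 - \pi)\,p_0,
\qquad
p_x = \frac{\exp(\beta_0 + \beta_1 x)}{1 + \exp(\beta_0 + \beta_1 x)}, \quad x \in \{0, 1\}
\]

Under those assumptions, moving $\pi$ from 0.40 to 0.60 changes the overall outcome rate by $0.20\,(p_1 - p_0)$.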

I wrote a very crude simulation to show one way to play around with the proportion of success for your binary predictor and to see its effect on the number of events in your outcome. Here is the code:

Code:

* Choose number of observation
capture clear
set obs 1000

* Simulate your binary predictor
gen x1 = rbinomial(1, 0.4)

* Log-odds of the outcome based on chosen effect size
gen logit = 0.5 + 2*x1

* Apply inverse-logit (expit) to get probabilities
gen prob = exp(logit) / (1 + exp(logit))

* Use probabilities to simulate binary outcome
gen y = rbinomial(1, prob)

tab1 y
logit y x1, nolog or

You can then run this again and change the probability of success for x1 from 0.4 to 0.6 (or any other number between 0 and 1). You could also add other covariates and include non-linear effects or higher-order interactions, but this is a very simple demonstration. You may also consider setting the RNG seed so the results are reproducible. I chose the effect size to be 2.0 on the logit scale, which is a very large effect (OR approximately 7.4). You can replace this with the coefficient from your own data/model.
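
To compare the two scenarios in a single run, a minimal sketch along the same lines could loop over both predictor success rates and report the simulated outcome rate for each. The intercept (0.5) and slope (2) are the illustrative values used above, not estimates from real data; the seed value and the larger sample size are arbitrary choices to reduce simulation noise.

```stata
* Sketch: compare the simulated outcome rate at two predictor success rates.
* 0.5 and 2 are the illustrative log-odds values from the code above;
* the seed and the number of observations are arbitrary assumptions.
clear
set seed 12345
set obs 100000

foreach p in 0.4 0.6 {
    * Remove the variables left over from the previous pass, if any
    capture drop x1 y

    * Simulate the binary predictor with success probability `p'
    gen byte x1 = rbinomial(1, `p')

    * invlogit() is the built-in inverse-logit (expit) function
    gen byte y = rbinomial(1, invlogit(0.5 + 2*x1))

    * Report the share of simulated outcomes equal to 1
    quietly summarize y
    display "P(x1 = 1) = `p'   simulated P(y = 1) = " %5.3f r(mean)
}
```

With these particular coefficients, the two simulated rates should land near 0.74 and 0.80, which is what the weighted average of invlogit(0.5) and invlogit(2.5) gives at 40% and 60%.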
Daniel Byrd

Join Date: Oct 2018

Posts: 4
#3

31 Oct 2018, 09:21

Thank you Matt, this is just what I was looking for!
Daniel Byrd

Join Date: Oct 2018

Posts: 4
#4

03 Nov 2018, 14:08

My only question is: how do I make this distribution match the distribution of my predictor variable?
Daniel Byrd

Join Date: Oct 2018

Posts: 4
#5

03 Nov 2018, 14:47

I tried following the fixed correlation thread, but that code didn't work.
Matt Warkentin

Join Date: May 2016

Posts: 104
#6

05 Nov 2018, 08:35

Hi Daniel,

You want the distribution of the simulated predictor to follow the distribution of your observed predictor? Is this correct? In general, to do Monte Carlo simulations we need to specify a probability distribution to sample from, so you can select one that closely follows your observed distribution (or a theoretical distribution). There is room to play around with different distributions, and you may parameterize them as you like. Something like an empirical Bayes approach could get around this, but I don't know that you want to go there.
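
For a binary predictor, one simple way to do this is to draw from a Bernoulli distribution whose success probability is the observed proportion of 1s, since that proportion fully describes a 0/1 variable. The sketch below is only an illustration: it uses Stata's shipped auto dataset, with the 0/1 variable foreign standing in for your observed predictor, and the name x_sim and the seed value are made up for the example.

```stata
* Stand-in data: the auto dataset that ships with Stata. The 0/1 variable
* "foreign" plays the role of the observed predictor (an assumption for
* illustration only; substitute your own variable).
sysuse auto, clear
set seed 2018

* Store the observed proportion of 1s
quietly summarize foreign, meanonly
local phat = r(mean)

* Draw a simulated predictor from a Bernoulli distribution with that proportion
gen byte x_sim = rbinomial(1, `phat')

* Compare the observed and simulated distributions
tabulate foreign
tabulate x_sim
```

The two tabulations will not match exactly in a sample this small, but the simulated proportion should get closer to the observed one as the number of observations grows.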

As for your second question, what fixed correlation thread are you referring to? I am unclear on what you mean.
