Manually producing probabilities after logit

Dario Maimone Ansaldo Patti

Join Date: Aug 2014

Posts: 505
#1

29 Oct 2021, 17:36

Dear All,

I estimate a logit model and then I need to calculate Prob[Y=1|X], where X is my set of regressors. Obviously, I can use:

Code:

predict probabilities

after the logit estimation. However, for some specific reasons I need to compute those probabilities "manually". Hence, I have:

Code:

webuse auto, clear
gen dummy=0
replace dummy=1 if price>9500
logit dummy mpg trunk length
gen probabilities2=1/(1+exp(-(_b[mpg]*mpg+_b[trunk]*trunk+_b[length]*length+_b[_cons]*_cons)))

The above gives me the same results as the postestimation command predict probabilities. However, rather than writing out the full list of estimated parameters and regressors, I would like to find a shorter way. My attempt was:

Code:

logit dummy mpg trunk length
matrix bmatrix=e(b)
mat accum m = mpg trunk length
matrix xb=bmatrix*m
gen probabilities=1/(1+exp(-(xb)))

My doubts: 1) Is the way I tried to generate xb correct? I am not sure about it. If not, how can I modify the calculation? 2) When I enter xb in the last line, I get the error message:

Code:

matrix operators that return matrices not allowed in this context

How can I include the generated values xb?

Thanks in advance for your help.

Dario
Tags: logit, matrix accum
Clyde Schechter

Join Date: Apr 2014

Posts: 30002
#2

29 Oct 2021, 18:51

Code:

matrix b = e(b)
matrix score double xb = b
gen p = invlogit(xb)

Your approach, which involves calculating a matrix of sums of squares and cross products and multiplying it by e(b), is misguided. The above is, I think, the simplest way in Stata to calculate the predicted probabilities "by hand."
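As a hedged illustration of the point above, the following sketch reuses the auto data and the dummy outcome from #1 (the variable names p_builtin, p_manual and the matrix name bm are made up for this example). It checks the -matrix score- result against -predict-, and then shows why the -matrix accum- attempt cannot work: the product of e(b) and the cross-product matrix is itself a small matrix, not one value per observation.

```stata
webuse auto, clear
generate byte dummy = price > 9500
logit dummy mpg trunk length

* Built-in predicted probabilities, kept for comparison
predict double p_builtin, pr

* e(b) is a row vector: three slopes plus _cons
matrix b = e(b)

* matrix score forms the linear index x*b for every observation
matrix score double xb = b
generate double p_manual = invlogit(xb)

* The two versions should agree up to rounding error
assert reldif(p_builtin, p_manual) < 1e-8

* matrix accum returns the cross-product matrix X'X (constant included
* by default), so b*m is a matrix with one row, not a variable
matrix accum m = mpg trunk length
matrix bm = b*m
display rowsof(bm) " x " colsof(bm)
```

Because bm is a matrix rather than a variable, -generate- cannot use it inside an expression, which is consistent with the error message reported in #1.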
Dario Maimone Ansaldo Patti

Join Date: Aug 2014

Posts: 505
#3

29 Oct 2021, 19:42

Clyde Schechter, thanks for your suggestion. The reason I was trying to calculate xb in that way is that, when calculating it, I need to remove one or more coefficients/regressors. For instance, suppose I run:

Code:

logit y X1 X2 X3 X4

Then, when I calculate the probabilities, I need to remove, say, X1, i.e. beta1*X1, and I should include only _b[X3]. I know that the result *will not* be Pr[y=1|X], but this is what I need to do. Regarding the first issue, I guess that after creating the vector of coefficients, I should "manually" remove the parameter associated with X1. But what about the second issue? I think that typing something like:

Code:

gen p = invlogit (xb+_b[X3])

would produce an error. Am I correct?

Thanks

Dario
Richard Williams

Join Date: Apr 2014

Posts: 4972
#4

29 Oct 2021, 20:16

I don’t have access to Stata right now, but I think you can do this with margins and the generate option:

https://www.stata.com/help.cgi?margins_generate

Use the at option to set the vars you want dropped to equal 0. Something like

margins, at(x1 = 0) gen(pred01)

I am doing this on my iPad, but hopefully this is close!

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
Clyde Schechter

Join Date: Apr 2014

Posts: 30002
#5

29 Oct 2021, 21:22

As noted by Richard Williams, there are several different approaches to this, and I think the approach with -margins- is a better one. It is usually not a good idea to calculate manually what Stata can calculate for you. I thought you were doing this for some sort of learning/demonstration/teaching purpose. Had I understood you wanted this so you could produce a different statistic, I, too, would have referred you to -margins-.

That said, if you were simply trying to demonstrate how predicted probabilities can be calculated manually, within the framework of #2, it is easy enough to exclude the X3 term:

Code:

matrix b = e(b)
matrix score double xb = b
gen p = invlogit(xb - _b[X3]*X3)

Moral of the story: if you want to find out how to do X, it is best to ask directly about that, rather than asking how to do Y because you have started down a path toward X that begins with Y. As here, the best path to X may not go through Y. In fact, sometimes there is no path to X through Y.
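For readers who want to try the term-exclusion idea on real data, here is a minimal sketch using the auto data and the dummy outcome from #1, with mpg standing in for the regressor to be dropped (that choice, and the names b0, xb0, p_sub and p_zero, are assumptions made for illustration). It compares two routes that should give the same answer: subtracting the term from the full linear index, and zeroing the coefficient in a copy of e(b) before scoring.

```stata
webuse auto, clear
generate byte dummy = price > 9500
logit dummy mpg trunk length

* Full linear index from all estimated coefficients
matrix b = e(b)
matrix score double xb = b

* Route 1: subtract the unwanted term from the full index
generate double p_sub = invlogit(xb - _b[mpg]*mpg)

* Route 2: set the coefficient to zero in a copy of e(b), then score
matrix b0 = b
matrix b0[1, colnumb(b0, "mpg")] = 0
matrix score double xb0 = b0
generate double p_zero = invlogit(xb0)

* Both routes should agree up to rounding error
assert reldif(p_sub, p_zero) < 1e-8
```

As noted in #3, neither variable is Pr[y=1|X]; each is the inverse logit of the index with one term left out.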
Dario Maimone Ansaldo Patti

Join Date: Aug 2014

Posts: 505
#6

30 Oct 2021, 01:22

Clyde Schechter and Richard Williams, thanks a lot for your useful insights. Your explanations simplify my life a lot.