Which tests are used to calculate the individual P-values in the STATA logit command (binary logistic regression)?

Mads Sandahl

Join Date: Aug 2021

Posts: 9
#1

03 Aug 2021, 03:29

This picture is from page 1290 in the STATA manual:

The model can be made with the following code in STATA:

use https://www.stata-press.com/data/r16/auto

keep make mpg weight foreign

logit foreign weight mpg

As I understand it, the overall fit of the model is tested with a chi2 test (Prob > chi2 = 0.0000), but how are the individual P-values (P>|z|) calculated?
Tags: logistic regression, logit, p-value
Felix Bittmann

Join Date: Aug 2018

Posts: 683
#2

03 Aug 2021, 03:44

As an example:

Code:

sysuse auto, clear
logit foreign weight mpg
matrix list r(table)
di (1-normal(1.8341116)) * 2
help normal()

Best wishes

Stata 18.0 MP | ORCID | Google Scholar
Maarten Buis

Join Date: Mar 2014

Posts: 3449
#3

03 Aug 2021, 04:06

Also see this Stata tip: https://www.stata-journal.com/articl...article=st0137

The name of those tests of individual parameters is the Wald test.
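
For reference, a sketch of the Wald test for a single coefficient in standard textbook notation. The symbols below are an assumption chosen for illustration, not labels from the Stata output: \hat\beta_j is the estimated coefficient, \widehat{\mathrm{se}} its standard error, and \Phi the standard normal CDF (normal() in Stata).

```latex
z_j = \frac{\hat\beta_j - 0}{\widehat{\mathrm{se}}(\hat\beta_j)},
\qquad
p_j = 2\,\Phi\!\left(-\lvert z_j \rvert\right)
```

Under the null hypothesis that the coefficient equals 0, z_j is approximately standard normal in large samples, which is why the output column is labelled P>|z|.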

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
Mads Sandahl

Join Date: Aug 2021

Posts: 9
#4

03 Aug 2021, 05:40

Maarten Buis Thank you. Just to be clear - the p-value marked with a red arrow in the image is calculated with a Wald test? If so, does it matter if the parameter is binary or continuous?
Mads Sandahl

Join Date: Aug 2021

Posts: 9
#5

03 Aug 2021, 05:41

Felix Bittmann Thank you for the reply but I'm not sure that I understand it.
Maarten Buis

Join Date: Mar 2014

Posts: 3449
#6

03 Aug 2021, 06:34

True, the p-value you marked with the red arrow is the p-value of a Wald test of the null hypothesis that the coefficient of mpg equals 0.

The parameter is never binary, but I assume you mean whether the explanatory variable is binary or continuous. The answer is no, that makes no difference.
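
A minimal sketch of that point, assuming a hypothetical indicator variable heavy (weight above 3,000 lbs, a cutoff chosen purely for illustration and not part of the original example). The p-value of the binary explanatory variable is obtained from its z-value in exactly the same way as the p-value of the continuous variable mpg:

```stata
sysuse auto, clear
* hypothetical binary explanatory variable, for illustration only
generate byte heavy = weight > 3000
logit foreign heavy mpg
* copy the results table; row 4 is pvalue, columns are heavy, mpg, _cons
matrix T = r(table)
* Wald p-value computed by hand for the binary regressor
scalar p_heavy = 2*normal(-abs(_b[heavy]/_se[heavy]))
* and for the continuous regressor
scalar p_mpg = 2*normal(-abs(_b[mpg]/_se[mpg]))
* each line shows the hand-computed value next to the one Stata reports
display p_heavy "  " T[4,1]
display p_mpg "  " T[4,2]
```

Both lines should show two matching numbers, up to rounding.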

Maarten Buis

Join Date: Mar 2014

Posts: 3449
#7

03 Aug 2021, 06:39

Mads: did you execute Felix's code? That is often the best way to learn. Don't be afraid of typing something wrong in Stata; the worst that can happen is that Stata returns an error message. Stata will not turn all cute kittens into brain eating zombie tigers when you type something wrong.

Mads Sandahl

Join Date: Aug 2021

Posts: 9
#8

03 Aug 2021, 06:40

Maarten Buis exactly, thank you very much.
Mads Sandahl

Join Date: Aug 2021

Posts: 9
#9

03 Aug 2021, 07:00

Originally posted by Maarten Buis:

Mads: did you execute Felix's code? That is often the best way to learn. Don't be afraid of typing something wrong in Stata; the worst that can happen is that Stata returns an error message. Stata will not turn all cute kittens into brain eating zombie tigers when you type something wrong.

Haha fair - I did try to run the code but I don't know what I should do with the result.

Maarten Buis

Join Date: Mar 2014
Posts: 3449

#10

03 Aug 2021, 09:06

Code:

. sysuse auto, clear
(1978 automobile data)

Opens the example dataset

Code:

. logit foreign weight mpg

Iteration 0:   log likelihood =  -45.03321  
Iteration 1:   log likelihood = -29.238536  
Iteration 2:   log likelihood = -27.244139  
Iteration 3:   log likelihood = -27.175277  
Iteration 4:   log likelihood = -27.175156  
Iteration 5:   log likelihood = -27.175156  

Logistic regression                                     Number of obs =     74
                                                        LR chi2(2)    =  35.72
                                                        Prob > chi2   = 0.0000
Log likelihood = -27.175156                             Pseudo R2     = 0.3966

------------------------------------------------------------------------------
     foreign | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
-------------+----------------------------------------------------------------
      weight |  -.0039067   .0010116    -3.86   0.000    -.0058894    -.001924
         mpg |  -.1685869   .0919175    -1.83   0.067    -.3487418     .011568
       _cons |   13.70837   4.518709     3.03   0.002     4.851859    22.56487
------------------------------------------------------------------------------

Estimates the model

Code:

. matrix list r(table)

r(table)[9,3]
           foreign:    foreign:    foreign:
            weight         mpg       _cons
     b   -.0039067   -.1685869   13.708367
    se   .00101161   .09191747   4.5187094
     z  -3.8618465  -1.8341116   3.0336907
pvalue   .00011253   .06663742   .00241582
    ll  -.00588943  -.34874183   4.8518593
    ul  -.00192397   .01156803   22.564875
    df           .           .           .
  crit    1.959964    1.959964    1.959964
 eform           0           0           0

This displays a matrix called r(table), which is stored by the logit command. Notice that the absolute value of the z-value for the coefficient of mpg is 1.8341116 (3rd row, 2nd column).

Code:

. di (1-normal(1.8341116)) * 2
.06663743

This shows how to transform the absolute value of the z-value (the test statistic for the Wald test) into the p-value. Notice that the p-value we just computed is almost the same as the p-value in r(table).

Code:

. help normal()

This shows the help-file for the normal() function

You asked "but how are the individual P-values (P>|z|) calculated", so Felix showed how that was done.

As a minor note: Stata probably does not compute 2*(1-normal(abs(z-value))), but rather 2*normal(-abs(z-value)). Mathematically the two are equivalent (the normal distribution is symmetric), but the latter is easier for computers, because it avoids subtracting two nearly equal numbers when the z-value is large. If we do that, we get exactly the same p-value as in r(table):

Code:

. di normal(-abs(_b[mpg]/_se[mpg])) * 2
.06663742
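
A small sketch of why the lower-tail form is safer numerically. The z-value of 10 is an arbitrary illustration, not a value from the model above. For such a large z, normal(z) is so close to 1 that 1-normal(z) should round to zero in double precision, whereas normal(-z) still returns the tiny tail probability:

```stata
* upper-tail form: normal(10) rounds to 1, so the difference is 0
scalar p_upper = 2*(1 - normal(10))
* lower-tail form: the small tail probability is kept
scalar p_lower = 2*normal(-10)
display p_upper
display p_lower
```

For the z-values in this thread the two forms differ only in the last displayed digit, so this matters mainly for very small p-values.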

Last edited by Maarten Buis; 03 Aug 2021, 09:15.


Felix Bittmann

Join Date: Aug 2018

Posts: 683
#11

03 Aug 2021, 10:13

Thanks to Maarten for explaining my code in detail. I am sorry, I thought this was more about the technical aspects of how Stata does this. From your posts, I think it might be beneficial for you to check out some textbooks about the theory of what a p-value means and how it relates to standard errors and normal theory. I am not sure which source gives the best explanation, but it is quite easy to find some information. For example:

https://online.stat.psu.edu/stat501/lesson/2/2.12
https://www.youtube.com/watch?v=KLnGOL_AUgA

Best wishes

Richard Williams

Join Date: Apr 2014

Posts: 4983
#12

03 Aug 2021, 13:41

For good measure, z = b/se. For mpg,

Code:

. display -.1685869/ .09191747
-1.8341116
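
The same Wald test can also be requested explicitly with the test command after logit. This is a brief sketch, assuming the model from this thread; the scalar names w_chi2 and w_p are illustrative. test reports a chi2 statistic with 1 degree of freedom, which is the square of the z-value, together with the same p-value:

```stata
sysuse auto, clear
logit foreign weight mpg
* Wald test of H0: coefficient of mpg = 0
test mpg
* save the stored results before running anything else
scalar w_chi2 = r(chi2)
scalar w_p = r(p)
* the square root of chi2(1) equals |z| = |b/se|
display sqrt(w_chi2) "  " abs(_b[mpg]/_se[mpg])
display w_p
```

The first display line should show the absolute z-value twice, and the second the p-value from the P>|z| column for mpg.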

-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
StataNow Version: 19.5 MP (2 processor)
WWW: https://www3.nd.edu/~rwilliam
Richard Williams

Join Date: Apr 2014

Posts: 4983
#13

03 Aug 2021, 13:53

When my daughter was in high school statistics, I tried to explain to her how t values and p values were related. It was one of the most painful experiences in either of our lives.

Mads Sandahl

Join Date: Aug 2021

Posts: 9
#14

04 Aug 2021, 00:37

Maarten Buis Felix Bittmann Richard Williams , thank you for the answers! My question was indeed "just" which test was used, but some extra information is good.