
  • Wald Test with binary predictor - does this make sense?

    Hi there,

    To evaluate my logistic regression model, I am planning to perform a Wald test. Besides my control variables, my dependent variable (success) is binary and my independent variable of interest (gender, coded 0 or 1) is binary as well.
    When I run my logit regression before the Wald test (test gender), I cannot declare gender as a factor variable (i.), otherwise I get the error message "variable gender not found" when running test gender.
    I am wondering whether it makes sense at all to perform the Wald test with a binary variable that takes the value 0 (for male) in my general model, because all the Wald test seems to do is set gender to 0, which would be male in my case?

    Would be very glad if you can answer this question.

    Thanks for the help.
    Best,
    Manuel




  • #2
    Originally posted by Manuel Grimeisen View Post
    When I run my logit regression before the Wald test (test gender), I cannot declare gender as a factor variable (i.), otherwise I get the error message "variable gender not found" when running test gender.
    Works for me.

    Code:
    . 
    . version 16.0

    . 
    . clear *

    . 
    . set seed `=strreverse("1527266")'

    . quietly set obs 200

    . 
    . generate byte out = runiform() < 0.5

    . generate byte sex = runiform() < 0.5

    . generate double ctl = runiform()

    . 
    . *
    . * Begin here
    . *
    . logit out i.sex c.ctl, nolog

    Logistic regression                             Number of obs     =        200
                                                    LR chi2(2)        =       0.19
                                                    Prob > chi2       =     0.9104
    Log likelihood =  -138.4456                     Pseudo R2         =     0.0007

    ------------------------------------------------------------------------------
             out |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           1.sex |    .122393   .2840221     0.43   0.667    -.4342801    .6790662
             ctl |    .004629   .4861549     0.01   0.992     -.948217     .957475
           _cons |  -.0021895    .303472    -0.01   0.994    -.5969837    .5926046
    ------------------------------------------------------------------------------

    . 
    . test 1.sex

     ( 1)  [out]1.sex = 0

               chi2(  1) =    0.19
             Prob > chi2 =    0.6665

    . 
    . exit

    end of do-file

    .




    • #3
      Manuel:
      as per the FAQ, reporting what you typed and what Stata gave you back within CODE delimiters (instead of devoting your precious time to explaining in words what occurred) can help interested listers enormously in helping you out in turn.
      That said, I hope that what follows can give you some clues about what happened with your estimate:
      Code:
      . use http://www.stata-press.com/data/r15/auto.dta
      (1978 Automobile Data)
      
      . logit foreign i.rep78
      
      note: 1.rep78 != 0 predicts failure perfectly
            1.rep78 dropped and 2 obs not used
      
      note: 2.rep78 != 0 predicts failure perfectly
            2.rep78 dropped and 8 obs not used
      
      note: 5.rep78 omitted because of collinearity
      Iteration 0:   log likelihood = -38.411464 
      Iteration 1:   log likelihood = -27.676628 
      Iteration 2:   log likelihood = -27.446054 
      Iteration 3:   log likelihood = -27.444671 
      Iteration 4:   log likelihood = -27.444671 
      
      Logistic regression                             Number of obs     =         59
                                                      LR chi2(2)        =      21.93
                                                      Prob > chi2       =     0.0000
      Log likelihood = -27.444671                     Pseudo R2         =     0.2855
      
      ------------------------------------------------------------------------------
           foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
             rep78 |
                1  |          0  (empty)
                2  |          0  (empty)
                3  |  -3.701302   .9906975    -3.74   0.000    -5.643033   -1.759571
                4  |  -1.504077   .9128709    -1.65   0.099    -3.293271    .2851168
                5  |          0  (omitted)
                   |
             _cons |   1.504077    .781736     1.92   0.054    -.0280969    3.036252
      ------------------------------------------------------------------------------
      
      . testparm(i.rep78)
      
       ( 1)  [foreign]3.rep78 = 0
       ( 2)  [foreign]4.rep78 = 0
      
                 chi2(  2) =   15.37
               Prob > chi2 =    0.0005
      
      . testparm(i2.rep78)
      no such variables;
      the specified varlist does not identify any testable coefficients
      r(111);
      
      .
      Kind regards,
      Carlo
      (Stata 19.0)



      • #4
        Thanks for your reply. Do I understand you correctly that it makes sense to perform the test? Could you briefly explain why it makes sense even though my binary variable gender also happens to be 0 in my logit model?
        Thanks!
        Best, Manuel



        • #5
          Manuel:
          Again, Stata code and results would have outperformed your word description.
          What do you mean by
          binary variable gender also happens to be 0
          ?
          My gut feeling is that the level 0 of your categorical variable is omitted to avoid the so-called dummy trap (https://en.wikipedia.org/wiki/Dummy_...le_(statistics)).
          But this is guess-work only...
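          As a toy illustration of that base-level omission (a sketch with made-up data, not your model), the level coded 0 is simply the reference category, and the Wald test concerns the coefficient on the other level:
          Code:
          clear
          set obs 200
          set seed 1234
          generate byte gender  = runiform() < 0.5
          generate byte success = runiform() < 0.5
          logit success i.gender, nolog    // 0.gender is the omitted base level
          test 1.gender                    // Wald test that the coefficient on 1.gender equals 0
          The null hypothesis is about the coefficient on 1.gender being zero, not about setting the variable gender itself to 0.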
          Kind regards,
          Carlo
          (Stata 19.0)



          • #6
            Hey Carlo,
            Sorry, I did not refresh the page, so I missed your reply. Here is my code:
            Code:
             quietly: logit i.gender success i.country i.updates c.previous_success
            test (i.gender)
            Looking at Josef's code, I see he set gender to 1 in the test:
            Code:
             test 1.gender
            Maybe this is the solution!?
            When I use your suggestion
            Code:
             testparm(i.gender)
            I get the error message "depvar may not be a factor variable".

            Thanks for your help!!



            • #7
              Manuel:
              you cannot -test- the regressand, only the predictors. End of story, and no fix available.
              By the way: is your regressand -success- or -gender-?
              Kind regards,
              Carlo
              (Stata 19.0)



              • #8
                Sorry, my fault:

                Code:
                 logit success i.gender i.country i.updates c.previous_success
                test (i.gender)
                So success is the dependent variable and gender the predictor.



                • #9
                  You have i.gender and success in the wrong order.

                  The Wald test is just the z statistic on gender (the reported chi-squared is its square).
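                  For instance, a quick sketch using the output in #2: squaring that z statistic reproduces the chi-squared that -test- reports.
                  Code:
                  display (_b[1.sex]/_se[1.sex])^2    // after the logit in #2: about 0.19, the chi2 shown by -test 1.sex-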



                  • #10
                    Manuel:
                    thanks for clarifying.
                    Now, what happens when you type:
                    Code:
                    testparm(i.gender)
                    Kind regards,
                    Carlo
                    (Stata 19.0)



                    • #11
                      When working with categorical independent variables, I too am a big fan of the testparm command. I don't need to know what the category numbers are or be worried about how many categories a variable has. Just refer to the categorical variables the same way as you did in your estimation command.

                      Code:
                      . webuse nhanes2f, clear
                      
                      . logit diabetes height weight i.female i.race, nolog
                      
                      Logistic regression                             Number of obs     =     10,335
                                                                      LR chi2(5)        =     149.63
                                                                      Prob > chi2       =     0.0000
                      Log likelihood = -1924.2517                     Pseudo R2         =     0.0374
                      
                      ------------------------------------------------------------------------------
                          diabetes |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                            height |  -.0580428   .0070647    -8.22   0.000    -.0718894   -.0441961
                            weight |   .0289802   .0028792    10.07   0.000     .0233372    .0346233
                          1.female |  -.3262659   .1307236    -2.50   0.013    -.5824795   -.0700524
                                   |
                              race |
                            Black  |   .4918164   .1262475     3.90   0.000     .2443758    .7392569
                            Other  |  -.0628163   .3488428    -0.18   0.857    -.7465356    .6209029
                                   |
                             _cons |   4.661086   1.176063     3.96   0.000     2.356044    6.966127
                      ------------------------------------------------------------------------------
                      
                      . testparm i.female
                      
                       ( 1)  [diabetes]1.female = 0
                      
                                 chi2(  1) =    6.23
                               Prob > chi2 =    0.0126
                      
                      . testparm i.race
                      
                       ( 1)  [diabetes]2.race = 0
                       ( 2)  [diabetes]3.race = 0
                      
                                 chi2(  2) =   15.32
                               Prob > chi2 =    0.0005
                      
                      . testparm i.female i.race
                      
                       ( 1)  [diabetes]1.female = 0
                       ( 2)  [diabetes]2.race = 0
                       ( 3)  [diabetes]3.race = 0
                      
                                 chi2(  3) =   20.60
                               Prob > chi2 =    0.0001
                      
                      .
                      Incidentally, if it is just a dummy variable you are interested in, all you have to do is look at the z value for it. You don't need an additional Wald test: the Wald chi-squared will be the z value squared, and the two-tailed p-values will be the same.
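                      For example, a quick check on the results above (a sketch; the _b and _se saved results still refer to the logit, and any tiny discrepancy is just rounding of the displayed z):
                      Code:
                      testparm i.female
                      display r(chi2)                           // Wald chi2 reported by testparm (6.23 above)
                      display (_b[1.female]/_se[1.female])^2    // square of the z statistic for 1.female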
                      -------------------------------------------
                      Richard Williams, Notre Dame Dept of Sociology
                      StataNow Version: 19.5 MP (2 processor)

                      EMAIL: [email protected]
                      WWW: https://www3.nd.edu/~rwilliam



                      • #12
                        Thanks everyone very much for your excellent support! With testparm it now works.

                        Thanks!
                        Best, Manuel



                        • #13
                          This has always intrigued me: when to use test and when to use testparm. I always try test first and, if I get an error message, I switch to testparm.



                          • #14
                            Test is more flexible. It has a lot of options, like doing Bonferroni adjustments. You can do things like

                            test x1 = 3

                            But, if you just want to test whether coefficients = 0, I find that testparm is usually easier.
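                            For example, a sketch of both after the nhanes2f logit in #11 (the -0.05 value below is made up, just to show the syntax; mtest(bonferroni) reports Bonferroni-adjusted p-values for the individual hypotheses):
                            Code:
                            test (1.female = 0) (2.race = 0) (3.race = 0), mtest(bonferroni)
                            test height = -0.05
                            testparm i.female i.race    // same joint zero-restriction test, without naming the levels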
                            -------------------------------------------
                            Richard Williams, Notre Dame Dept of Sociology
                            StataNow Version: 19.5 MP (2 processor)

                            EMAIL: [email protected]
                            WWW: https://www3.nd.edu/~rwilliam

