
  • Wald Test with binary predictor - does this make sense?

    Hi there,

    To evaluate my logistic regression model, I am planning to perform a Wald test. Besides my control variables, my dependent variable (success) is binary and my independent variable of interest (gender, coded 0 or 1) is binary as well.
    When I run my logit regression before the Wald test (test gender), I cannot declare gender as a factor variable (i.), otherwise I get the error message "variable gender not found" when running test gender.
    I am wondering whether it makes sense at all to perform the Wald test with a binary variable that takes the value 0 (for male) in my general model, because all the Wald test seems to do is set gender to 0, which would be male in my case?

    Would be very glad if you can answer this question.

    Thanks for the help.
    Best,
    Manuel




  • #2
    Originally posted by Manuel Grimeisen View Post
    When I run my logit regression before the Wald test (test gender), I cannot declare gender as a factor variable (i.), otherwise I get the error message "variable gender not found" when running test gender.
    Works for me.

    Code:
    . 
    . version 16.0

    . 
    . clear *

    . 
    . set seed `=strreverse("1527266")'

    . quietly set obs 200

    . 
    . generate byte out = runiform() < 0.5

    . generate byte sex = runiform() < 0.5

    . generate double ctl = runiform()

    . 
    . *
    . * Begin here
    . *
    . logit out i.sex c.ctl, nolog

    Logistic regression                             Number of obs     =        200
                                                    LR chi2(2)        =       0.19
                                                    Prob > chi2       =     0.9104
    Log likelihood =  -138.4456                     Pseudo R2         =     0.0007

    ------------------------------------------------------------------------------
             out |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           1.sex |    .122393   .2840221     0.43   0.667    -.4342801    .6790662
             ctl |    .004629   .4861549     0.01   0.992     -.948217     .957475
           _cons |  -.0021895    .303472    -0.01   0.994    -.5969837    .5926046
    ------------------------------------------------------------------------------

    . 
    . test 1.sex

     ( 1)  [out]1.sex = 0

               chi2(  1) =    0.19
             Prob > chi2 =    0.6665

    . 
    . exit

    end of do-file

    .




    • #3
      Manuel:
      as per the FAQ, reporting what you typed and what Stata gave you back within CODE delimiters (instead of devoting your precious time to explaining in words what occurred) can help interested listers enormously in helping you out in turn.
      That said, I hope that what follows can give you some clues about what happened with your estimate:
      Code:
      . use http://www.stata-press.com/data/r15/auto.dta
      (1978 Automobile Data)
      
      . logit foreign i.rep78
      
      note: 1.rep78 != 0 predicts failure perfectly
            1.rep78 dropped and 2 obs not used
      
      note: 2.rep78 != 0 predicts failure perfectly
            2.rep78 dropped and 8 obs not used
      
      note: 5.rep78 omitted because of collinearity
      Iteration 0:   log likelihood = -38.411464 
      Iteration 1:   log likelihood = -27.676628 
      Iteration 2:   log likelihood = -27.446054 
      Iteration 3:   log likelihood = -27.444671 
      Iteration 4:   log likelihood = -27.444671 
      
      Logistic regression                             Number of obs     =         59
                                                      LR chi2(2)        =      21.93
                                                      Prob > chi2       =     0.0000
      Log likelihood = -27.444671                     Pseudo R2         =     0.2855
      
      ------------------------------------------------------------------------------
           foreign |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      -------------+----------------------------------------------------------------
             rep78 |
                1  |          0  (empty)
                2  |          0  (empty)
                3  |  -3.701302   .9906975    -3.74   0.000    -5.643033   -1.759571
                4  |  -1.504077   .9128709    -1.65   0.099    -3.293271    .2851168
                5  |          0  (omitted)
                   |
             _cons |   1.504077    .781736     1.92   0.054    -.0280969    3.036252
      ------------------------------------------------------------------------------
      
      . testparm(i.rep78)
      
       ( 1)  [foreign]3.rep78 = 0
       ( 2)  [foreign]4.rep78 = 0
      
                 chi2(  2) =   15.37
               Prob > chi2 =    0.0005
      
      . testparm(i2.rep78)
      no such variables;
      the specified varlist does not identify any testable coefficients
      r(111);
      
      .
      Kind regards,
      Carlo
      (Stata 19.0)



      • #4
        Thanks for your reply. Do I understand you correctly that it makes sense to perform the test? Could you briefly explain why it makes sense even though my binary variable gender also happens to be 0 in my logit model?
        Thanks!
        Best, Manuel



        • #5
          Manuel:
          Again, Stata code and results would have outperformed your word description.
          What do you mean by
          binary variable gender also happens to be 0
          ?
          My gut feeling is that the level 0 of your categorical variable is omitted to avoid the so-called dummy trap (https://en.wikipedia.org/wiki/Dummy_...le_(statistics)).
          But this is guess-work only...
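          As a toy illustration of that base-level omission (a sketch with made-up data, not your model), the level coded 0 is simply the reference category, and the Wald test concerns the coefficient on the other level:
          Code:
          clear
          set obs 200
          set seed 1234
          generate byte gender  = runiform() < 0.5
          generate byte success = runiform() < 0.5
          logit success i.gender, nolog    // 0.gender is the omitted base level
          test 1.gender                    // Wald test that the coefficient on 1.gender equals 0
          The null hypothesis is about the coefficient on 1.gender being zero, not about setting the variable gender itself to 0.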
          Kind regards,
          Carlo
          (Stata 19.0)



          • #6
            Hey Carlo,
            Sorry, I did not refresh the page, so I missed your reply. Here is my code:
            Code:
             quietly: logit i.gender success i.country i.updates c.previous_success
            test (i.gender)
            Looking at Josef's code, I see he set gender to 1 in the test:
            Code:
             test 1.gender
            Maybe this is the solution!?
            When I use your suggestion
            Code:
             testparm(i.gender)
            I get the error message "depvar may not be a factor variable".

            Thanks for your help!!



            • #7
              Manuel:
              you cannot -test- the regressand, only the predictors. End of story, and no fix available.
              By the way: is your regressand -success- or -gender-?
              Kind regards,
              Carlo
              (Stata 19.0)



              • #8
                Sorry, my fault:

                Code:
                 logit success i.gender i.country i.updates c.previous_success
                test (i.gender)
                So success is the dependent variable and gender the predictor.



                • #9
                  You have i.gender and success in the wrong order.

                  The Wald test is just the z statistic on gender (the reported chi-squared is its square).
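                  For instance, a quick sketch using the output in #2: squaring that z statistic reproduces the chi-squared that -test- reports.
                  Code:
                  display (_b[1.sex]/_se[1.sex])^2    // after the logit in #2: about 0.19, the chi2 shown by -test 1.sex-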



                  • #10
                    Manuel:
                    thanks for clarifying.
                    Now, what happens when you type:
                    Code:
                    testparm(i.gender)
                    Kind regards,
                    Carlo
                    (Stata 19.0)



                    • #11
                      When working with categorical independent variables, I too am a big fan of the testparm command. I don't need to know what the category numbers are or be worried about how many categories a variable has. Just refer to the categorical variables the same way as you did in your estimation command.

                      Code:
                      . webuse nhanes2f, clear
                      
                      . logit diabetes height weight i.female i.race, nolog
                      
                      Logistic regression                             Number of obs     =     10,335
                                                                      LR chi2(5)        =     149.63
                                                                      Prob > chi2       =     0.0000
                      Log likelihood = -1924.2517                     Pseudo R2         =     0.0374
                      
                      ------------------------------------------------------------------------------
                          diabetes |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                      -------------+----------------------------------------------------------------
                            height |  -.0580428   .0070647    -8.22   0.000    -.0718894   -.0441961
                            weight |   .0289802   .0028792    10.07   0.000     .0233372    .0346233
                          1.female |  -.3262659   .1307236    -2.50   0.013    -.5824795   -.0700524
                                   |
                              race |
                            Black  |   .4918164   .1262475     3.90   0.000     .2443758    .7392569
                            Other  |  -.0628163   .3488428    -0.18   0.857    -.7465356    .6209029
                                   |
                             _cons |   4.661086   1.176063     3.96   0.000     2.356044    6.966127
                      ------------------------------------------------------------------------------
                      
                      . testparm i.female
                      
                       ( 1)  [diabetes]1.female = 0
                      
                                 chi2(  1) =    6.23
                               Prob > chi2 =    0.0126
                      
                      . testparm i.race
                      
                       ( 1)  [diabetes]2.race = 0
                       ( 2)  [diabetes]3.race = 0
                      
                                 chi2(  2) =   15.32
                               Prob > chi2 =    0.0005
                      
                      . testparm i.female i.race
                      
                       ( 1)  [diabetes]1.female = 0
                       ( 2)  [diabetes]2.race = 0
                       ( 3)  [diabetes]3.race = 0
                      
                                 chi2(  3) =   20.60
                               Prob > chi2 =    0.0001
                      
                      .
                      Incidentally, if it is just a dummy variable you are interested in, all you have to do is look at the z value for it. You don't need an additional Wald test: the Wald chi-squared will be the z value squared, and the two-tailed p-values will be the same.
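                      For example, a quick check on the results above (a sketch; the _b and _se saved results still refer to the logit, and any tiny discrepancy is just rounding of the displayed z):
                      Code:
                      testparm i.female
                      display r(chi2)                           // Wald chi2 reported by testparm (6.23 above)
                      display (_b[1.female]/_se[1.female])^2    // square of the z statistic for 1.female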
                      -------------------------------------------
                      Richard Williams, Notre Dame Dept of Sociology
                      StataNow Version: 19.5 MP (2 processor)

                      EMAIL: [email protected]
                      WWW: https://www3.nd.edu/~rwilliam



                      • #12
                        Thanks everyone very much for your excellent support! With testparm it now works.

                        Thanks!
                        Best, Manuel



                        • #13
                          This has always intrigued me: when to use test and when to use testparm. I always try test first and, if I get an error message, I switch to testparm.



                          • #14
                            Test is more flexible. It has a lot of options, like doing Bonferroni adjustments. You can do things like

                            test x1 = 3

                            But, if you just want to test whether coefficients = 0, I find that testparm is usually easier.
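                            For example, a sketch of both after the nhanes2f logit in #11 (the -0.05 value below is made up, just to show the syntax; mtest(bonferroni) reports Bonferroni-adjusted p-values for the individual hypotheses):
                            Code:
                            test (1.female = 0) (2.race = 0) (3.race = 0), mtest(bonferroni)
                            test height = -0.05
                            testparm i.female i.race    // same joint zero-restriction test, without naming the levels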
                            -------------------------------------------
                            Richard Williams, Notre Dame Dept of Sociology
                            StataNow Version: 19.5 MP (2 processor)

                            EMAIL: [email protected]
                            WWW: https://www3.nd.edu/~rwilliam

