
  • Multinomial Logit when dependent variable can be positive, negative, or zero

    Hi guys,

    I have a categorical variable with four options: (i) no planning, (ii) only health planning, (iii) only financial planning, and (iv) both plannings. I want to run a multinomial logit model using wealth as an independent variable. Wealth in my data is measured as assets minus debts, so it can be positive, negative, or zero. I then use the following command:

    Code:
    . mlogit genplan wealth if (year == 2012) [pweight=rwtresp], baseoutcome(1)
    
    Multinomial logistic regression                   Number of obs   =       9601
                                                      Wald chi2(3)    =     131.26
                                                      Prob > chi2     =     0.0000
    Log pseudolikelihood =  -45455497                 Pseudo R2       =     0.0494
    
    --------------------------------------------------------------------------------
                   |               Robust
           genplan |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------------+----------------------------------------------------------------
    No_Planning    |  (base outcome)
    ---------------+----------------------------------------------------------------
    Only_Health    |
            wealth |  -7.17e-07   3.48e-07    -2.06   0.039    -1.40e-06   -3.55e-08
             _cons |  -1.112479    .076825   -14.48   0.000    -1.263053   -.9619045
    ---------------+----------------------------------------------------------------
    Only_Financial |
            wealth |   1.73e-06   1.75e-07     9.93   0.000     1.39e-06    2.08e-06
             _cons |  -1.018951   .0575423   -17.71   0.000    -1.131732   -.9061704
    ---------------+----------------------------------------------------------------
    Both_Plannings |
            wealth |   1.73e-06   1.74e-07     9.94   0.000     1.39e-06    2.07e-06
             _cons |   .0092796   .0505876     0.18   0.854    -.0898703    .1084296
    --------------------------------------------------------------------------------
    As you can see, the coefficients are zero. By this result, my understanding is that wealth has almost no effect.

    However, I then run a second regression using a categorical variable for wealth, where I group wealth into four categories based on the quartiles of the distribution. Here are the results:

    Code:
    . mlogit genplan i.q4wealth if (year == 2012) [pweight=rwtresp], baseoutcome (1)
    
    Multinomial logistic regression                   Number of obs   =       9601
                                                      Wald chi2(9)    =    1032.85
                                                      Prob > chi2     =     0.0000
    Log pseudolikelihood =  -44175787                 Pseudo R2       =     0.0762
    
    --------------------------------------------------------------------------------
                   |               Robust
           genplan |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    ---------------+----------------------------------------------------------------
    No_Planning    |  (base outcome)
    ---------------+----------------------------------------------------------------
    Only_Health    |
          q4wealth |
     2nd Quartile  |  -.1316657   .1147428    -1.15   0.251    -.3565574     .093226
     3rd Quartile  |  -.4551799   .1646513    -2.76   0.006    -.7778905   -.1324693
     4th Quartile  |  -.1941912   .2018469    -0.96   0.336     -.589804    .2014215
                   |
             _cons |  -1.129302   .0685136   -16.48   0.000    -1.263586   -.9950174
    ---------------+----------------------------------------------------------------
    Only_Financial |
          q4wealth |
     2nd Quartile  |   1.141276   .1107899    10.30   0.000     .9241312     1.35842
     3rd Quartile  |   1.777783    .116215    15.30   0.000     1.550006    2.005561
     4th Quartile  |   2.314921   .1310731    17.66   0.000     2.058022    2.571819
                   |
             _cons |  -1.593136   .0850319   -18.74   0.000    -1.759795   -1.426476
    ---------------+----------------------------------------------------------------
    Both_Plannings |
          q4wealth |
     2nd Quartile  |   .9827859   .0802326    12.25   0.000     .8255329    1.140039
     3rd Quartile  |   1.727817    .088295    19.57   0.000     1.554762    1.900872
     4th Quartile  |   2.604634   .1040449    25.03   0.000      2.40071    2.808559
                   |
             _cons |   -.604249   .0567623   -10.65   0.000    -.7155011   -.4929969
    --------------------------------------------------------------------------------
    Now the results are quite different. It is clear that the wealthier groups are more likely to do only financial planning or both kinds of planning. This result is reasonable and expected in my opinion.
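
    The post does not show how q4wealth was created. One way such a variable could be built, assuming quartiles of the weighted 2012 wealth distribution are wanted (the exact construction is an assumption, not something stated in the thread):

    ```stata
    * Quartile groups of wealth (1 = lowest quartile), computed within the 2012 sample
    xtile q4wealth = wealth if year == 2012 [pweight=rwtresp], nquantiles(4)
    ```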

    Does anybody know why the results are so different between the two regressions? Am I missing something?

    Thanks!
    Last edited by Diego Gomes; 24 Oct 2018, 22:43. Reason: Adding tags

  • #2
    Originally posted by Diego Gomes
    As you can see, the coefficients are zero.
    No they're not. Divide your wealth variable by 10^7, refit your model and report back to the list.
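
    A minimal sketch of dividing wealth by 10^7 and refitting, assuming the new variable is named wealth2 (the name used in the follow-up post) and reusing the options from the original command:

    ```stata
    * Express wealth in units of 10^7 dollars; double storage limits rounding error
    generate double wealth2 = wealth / 1e7

    * Refit the same model on the rescaled predictor
    mlogit genplan wealth2 if (year == 2012) [pweight=rwtresp], baseoutcome(1)
    ```

    Only the wealth coefficients, their standard errors, and their confidence limits should change (each multiplied by 10^7); the z statistics, p-values, constants, and log pseudolikelihood stay the same.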



    • #3
      Hi Joseph,

      Thanks for the reply. I created the wealth2 variable. Here is the result of your request:

      Code:
      . mlogit genplan wealth2 if (year == 2012) [pweight=rwtresp], baseoutcome(1)
      
      Multinomial logistic regression                   Number of obs   =       9601
                                                        Wald chi2(3)    =     131.26
                                                        Prob > chi2     =     0.0000
      Log pseudolikelihood =  -45455497                 Pseudo R2       =     0.0494
      
      --------------------------------------------------------------------------------
                     |               Robust
             genplan |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
      ---------------+----------------------------------------------------------------
      No_Planning    |  (base outcome)
      ---------------+----------------------------------------------------------------
      Only_Health    |
             wealth2 |  -7.172241   3.478485    -2.06   0.039    -13.98995   -.3545361
               _cons |  -1.112479    .076825   -14.48   0.000    -1.263053   -.9619045
      ---------------+----------------------------------------------------------------
      Only_Financial |
             wealth2 |   17.34878   1.746351     9.93   0.000     13.92599    20.77156
               _cons |  -1.018951   .0575423   -17.71   0.000    -1.131732   -.9061704
      ---------------+----------------------------------------------------------------
      Both_Plannings |
             wealth2 |   17.29581   1.740808     9.94   0.000     13.88389    20.70773
               _cons |   .0092796   .0505876     0.18   0.854    -.0898703    .1084296
      --------------------------------------------------------------------------------
      Now the coefficients are reasonable to me. Do you think it is good practice to normalize continuous variables before running logits (for example, by subtracting the mean)?

      Thanks!



      • #4
        Originally posted by Diego Gomes
        Do you think it is good practice to normalize continuous variables before running logits (for example, by subtracting the mean)?
        Mean centering of predictors is topical: see this recent thread for some advice from list members.
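
        For reference, centering a predictor that enters the model only linearly shifts the constants but leaves the slope coefficients unchanged. A sketch, assuming the unweighted 2012 sample mean is an acceptable centering value and using the hypothetical variable name wealth_c:

        ```stata
        * Mean of wealth in the estimation year (unweighted; this choice is an assumption)
        summarize wealth if year == 2012, meanonly

        * Centered copy: 0 now corresponds to mean wealth (wealth_c is a hypothetical name)
        generate double wealth_c = wealth - r(mean)

        mlogit genplan wealth_c if (year == 2012) [pweight=rwtresp], baseoutcome(1)
        ```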



        • #5
          Thank you! I'll have a look. If I have any new questions about it, I'll reply to you.



          • #6
            Originally posted by Diego Gomes
            Now the coefficients are reasonable to me. Do you think it is good practice to normalize continuous variables before running logits (for example, by subtracting the mean)?

            Thanks!
            I agree with Joseph that the wisdom and acceptability of mean centering varies by topic. I'd just add that you want wealth to be on a reasonable scale. I assume wealth was originally denominated in dollars, so each additional dollar of net wealth was associated with a 7.17e-07 decrease in the log odds of engaging in only health planning (relative to no planning). That's a bit hard to interpret. Joseph had you redenominate wealth in units of 1e7 dollars, that is, ten million dollars. Maybe that's also a bit odd! However, it clearly demonstrated that the coefficients on wealth, treated as continuous, were not zero. Mean centering may be irrelevant for this problem if there isn't a natural and easily understood value of mean wealth. If the distribution of wealth is very right skewed, then most people don't have anywhere near the mean level of wealth.
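
            The scale argument can be written out. A sketch, using notation that is not in the thread: let w be wealth in dollars, c > 0 a rescaling constant, and alpha_j and beta_j the constant and wealth coefficient for outcome j relative to the base outcome 1.

            ```latex
            \ln\frac{\Pr(y = j)}{\Pr(y = 1)}
              = \alpha_j + \beta_j\, w
              = \alpha_j + (c\,\beta_j)\,\frac{w}{c}
            ```

            Dividing wealth by c therefore multiplies its coefficient by c and leaves the fit untouched. With c = 10^7, the Only_Health coefficient of -7.17e-07 becomes about -7.17, which matches the refitted output.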
            Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

            When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



            • #7
              Thanks, Weiwen!

              Yes, the distribution is very skewed, and I agree with your point. I was planning just to put the wealth variable on a different scale (like dividing by 100,000). Do you have any other suggestions?



              • #8
                Originally posted by Diego Gomes
                Thanks, Weiwen!

                Yes, the distribution is very skewed, and I agree with your point. I was planning just to put the wealth variable on a different scale (like dividing by 100,000). Do you have any other suggestions?
                Not really, this isn't my field. I would probably choose some sort of sensible increment of wealth. $100k seems sensible enough to me. Also, the -margins- command can really help you interpret the log odds.
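
                A sketch of both suggestions, assuming genplan is coded 1 to 4 in the order listed in the first post (so that 4 is both plannings) and using the hypothetical variable name wealth100k; the wealth values passed to margins are chosen only for illustration:

                ```stata
                * Wealth in units of $100,000 (wealth100k is a hypothetical name)
                generate double wealth100k = wealth / 100000

                mlogit genplan wealth100k if (year == 2012) [pweight=rwtresp], baseoutcome(1)

                * Predicted probability of outcome 4 at wealth of 0, $100k, $500k, and $1 million
                margins, at(wealth100k = (0 1 5 10)) predict(outcome(4))

                * Average marginal effect of wealth (per $100k) on that probability
                margins, dydx(wealth100k) predict(outcome(4))
                ```

                The predicted probabilities are on a scale that is usually easier to read than log odds, and they do not depend on the units chosen for wealth.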