  • Mixed logit coefficient interpretation

    Hi,

    I am running a multi-level logistic regression with three levels and have some questions about interpreting and comparing coefficients.

    First, I am trying to interpret the odds ratios and marginal effects of my main predictor variable, which is the logged percent of mobile coverage in a locality (original distribution of percentages highly skewed). Would I interpret the marginal effect, which is 0.0129, as "a 1% increase in coverage is associated with a 0.1% increase in probability of y"? Further, I am interested in interpreting the effect of an interaction term with this variable and another percentage term that is not logged.

    Second, can the normal "Test" command be used with mixed logit to compare the significance of coefficient estimates?

    Thank you.

  • #2
    Your description of what you want is, to me, very confusing. I think it would be better if you showed the regression command you ran, the output of that regression, the -margins- command you used after that, and the output from -margins-. Then indicate which particular outputs you want help interpreting.

    As for your second question, it is meaningless to compare the significance of coefficient estimates. The difference between statistically significant and not statistically significant is not, itself, statistically significant, and there is no sensible test for comparing different p-values. What you can do, and perhaps this is what you meant, is you can test the difference between different coefficients in a model for statistical significance. If that is what you want, the -test- command works just as well after multilevel models as it does after single level models.
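
    As a minimal sketch of that last point: the variable names below (y, x1, x2, region, district) are hypothetical placeholders, not anything from your data, and the grouping structure is only assumed for illustration.

    ```stata
    * Hypothetical three-level logit: observations within district within region
    melogit y x1 x2 || region: || district:

    * Wald test of H0: the coefficient on x1 equals the coefficient on x2
    test x1 = x2

    * The same contrast as a point estimate with a confidence interval
    lincom x1 - x2
    ```

    Both -test- and -lincom- work on the coefficient (log odds) scale, so the comparison is usually sensible only when x1 and x2 are measured in comparable units.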


    • #3
      Dear Clyde,

      I apologize for the ambiguity. Here is the code I ran:

      Code:
      eststo:melogit fooddummy gender hhsize secondary i.empstatus  lncoverage lnriots riotscoverage lnpop ln_ttime nlights_mean2013 harvarea ||country: ||gid:, covariance(unstructured)
      margins, dydx(*) atmeans post
      I am attaching my margins command output because I do not currently have access to the actual Stata output. I am having trouble interpreting the coefficients on "lncoverage" (the log of the percentage of mobile coverage in a given area) and on the interaction term "riotscoverage" between lncoverage and lngridriots (the log of the number of riots).

      I was in fact referring to the difference between coefficients and apologize for the lack of clarity in my initial question. Thank you for the response.

      [Attachment: Screen Shot 2018-01-30 at 9.41.05 PM.png, a screenshot of the -margins- output]


      • #4
        I'm glad you showed the code and output. The commands are wrong and the output is not interpretable.

        The problem is that you didn't use factor variable notation in the regression command. Consequently, -margins- has no way of knowing that riotscoverage is the interaction between lnriots and lncoverage. So everything it computes treats it as an unrelated variable, which means everything in the -margins- output is wrong.

        So start over:

        Code:
        eststo: melogit fooddummy gender hhsize secondary i.empstatus c.lncoverage##c.lnriots ///
            lnpop ln_ttime nlights_mean2013 harvarea ||country: ||gid:, covariance(unstructured)
        Then, since your main variables are both continuous, you will have to select "interesting" values of them to evaluate the predicted probabilities and the marginal effects. For the sake of illustration here, I will pretend that the interesting values of lncoverage are -1 -.5 0 .5 and 1, and the interesting values of lnriots are 0 0.5 1 1.5 and 2. Then you could understand what your model is telling you by running
        Code:
        margins, at(lncoverage = (-1(0.5)1) lnriots = (0 (0.5) 2)) atmeans
        marginsplot, name("predicted_probabilities", replace)
        
        margins, dydx(lnriots) at(lncoverage = (-1(0.5)1) lnriots = (0 (0.5) 2)) atmeans
        marginsplot, name("marginal_effects_lnriots", replace)
        
        margins, dydx(lncoverage) at(lncoverage = (-1(0.5)1) lnriots = (0 (0.5) 2)) atmeans
        marginsplot, name("marginal_effects_lncoverage", replace)
        By looking at those graphs, you will be able to get a better sense of what your model says is going on here.
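
        If you are unsure which values count as "interesting", one possible starting point (a sketch, not the only reasonable choice) is to look at the observed distributions and pick values that span the bulk of the data. The variable names are the ones from the command above.

        ```stata
        * Observed range and summary statistics of the two interacting variables
        summarize lncoverage lnriots, detail

        * Selected percentiles, which could serve as candidate values for at()
        centile lncoverage lnriots, centile(10 25 50 75 90)
        ```

        Values far outside the observed range would make -margins- extrapolate beyond the data.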







        • #5
          Thanks for the prompt response. A couple of questions:

          1). Does the factor-variable notation affect only the margins output, or are the parameter estimates and odds ratios on the interaction terms also incorrect?
          2). For the non-interaction term, just the logged mobile coverage on its own, I evaluated the marginal effects at the variable mean. I realize I need to re-run the specification, but is the logged percentage when interpreted with a logged DV the same as the elasticity interpretation?

          Thank you.


          • #6
            1. The factor variable notation does not affect anything substantive in the regression output (though it does change the labeling in some cases). The coefficients and odds ratios are all right.

            2. Well, when you have a continuous outcome, y, and a continuous predictor, x, the definition of elasticity is d log y/d log x, so if you regress log y on log x, the coefficient is the elasticity. But you have a dichotomous outcome here, so you can't log transform it. So there is no such thing as the elasticity of y with respect to x in this case. Now, perhaps you are thinking of d log Prob(y)/d log x. One might call that the elasticity of the probability of y with respect to x, I suppose. I don't see that idea used much, if at all. Perhaps in other fields, not in mine. (Then again, in my field we hardly ever use elasticity at all.) If you want that, you would have to modify the margins commands from -dydx()- to -eydx()-.

            Added: I just realized that my choice of notation with y and x is a bit confusing here. The terms dydx and eydx are both the names of options in the -margins- command and are written that way regardless of what the variables involved are. But the y in dydx or eydx refers to the probability of a non-zero outcome, and the x in dydx or eydx refers to the predictor variable in the model. Ordinarily eydx would be called the semi-elasticity. But in your case, the predictor in the model is, itself, the log transform of a variable, and you are interested in the elasticity with respect to that variable. So eydx, the semi-elasticity with respect to the model predictor, corresponds to eyex, the elasticity, with respect to the variable of which the model predictor is already the log transform. In equations, if x = log u is the predictor in a regression model of outcome y, then eydx is the semi-elasticity with respect to x and is also the elasticity with respect to u.
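
            To spell out the last sentence as a short derivation (a sketch, with u as in the paragraph above and p, a symbol introduced here only for illustration, standing for the predicted probability of a non-zero outcome):

            ```latex
            x = \log u
            \quad\Longrightarrow\quad
            \texttt{eydx} \;=\; \frac{\partial \log p}{\partial x}
            \;=\; \frac{\partial \log p}{\partial \log u},
            \qquad\text{whereas}\qquad
            \texttt{eyex} \;=\; \frac{\partial \log p}{\partial \log x}
            \;=\; \frac{\partial \log p}{\partial \log(\log u)} .
            ```

            So -eydx()- applied to the logged predictor gives the proportional change in the probability per proportional change in the untransformed variable, while -eyex()- would take a further log of a variable that is already logged.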
            Last edited by Clyde Schechter; 30 Jan 2018, 23:29.


            • #7
              Thanks for the response regarding interpretation.

              Another question back to the interaction term: The main intuition is that riots affect the binary outcome, and the spread of mobile coverage impacts riots. Though the data cannot make any claims about causality, would another reasonable approach be to run a two-stage least squares (not IV) given the difficulty in interpreting the interaction in this model? I am not sure if this is common for this type of problem.


              • #8
                Sorry, but that's territory I don't know much about and can't advise you. If you are looking for a model that would attempt to look at a path of coverage -> riots -> binary outcome, my first thought would be to look at -gsem-, where you could do a full-blown path model.


                • #9
                  Oh that is very helpful. Thanks so much for your help through this.


                  • #10
                    Hi Clyde,

                    Last question: I've been running the command you specified earlier:

                    Code:
                     
                     eststo: melogit fooddummy gender hhsize secondary i.empstatus c.lncoverage##c.lnriots ///
                         lnpop ln_ttime nlights_mean2013 harvarea ||country: ||gid:, covariance(unstructured)
                    but am not getting a coefficient term at all for the interaction. I know for substantive interpretation, I need to plot the margins, but should I still be getting a coefficient term?

                    Thanks.


                    • #11
                      Clarification: I have a parameter estimate coefficient but no interaction term.


                      • #12
                        You should be getting output for the interaction term. If Stata has dropped the interaction term because it is collinear with something, it will tell you that in the -melogit- output and explain why. I think you need to show the exact command and the exact, complete -melogit- output here so we can see what is going on. Do this by copying directly from the Results window or your log file and pasting here into the forum editor in code delimiters. Please do not edit in any way: there are no "minor" details.


                        • #13
                          Thanks, Clyde.

                          Here is my exact code:
                          Code:
                          eststo:melogit fooddummy gender hhsize secondary i.empstatus c.ln_gridriots##c.lncoverage lnpop ln_ttime nlights_mean2013 harvarea ||country: ||gid:, covariance(unstructured)
                          estimates store logit_riots
                          Here is the output:
                          Code:
                          . eststo:melogit fooddummy gender hhsize secondary i.empstatus c.ln_gridriots##
                          > c.lncoverage lnpop ln_ttime nlights_mean2013 harvarea ||country: ||gid:, cova
                          > riance(unstructured)
                          
                          Fitting fixed-effects model:
                          
                          Iteration 0:   log likelihood = -28931.523  
                          Iteration 1:   log likelihood = -28910.488  
                          Iteration 2:   log likelihood = -28910.478  
                          Iteration 3:   log likelihood = -28910.478  
                          
                          Refining starting values:
                          
                          Grid node 0:   log likelihood = -27020.925
                          
                          Fitting full model:
                          
                          Iteration 0:   log likelihood = -27020.925  (not concave)
                          Iteration 1:   log likelihood = -27016.728  (backed up)
                          Iteration 2:   log likelihood = -26939.173  
                          Iteration 3:   log likelihood = -26925.409  
                          Iteration 4:   log likelihood = -26925.278  
                          Iteration 5:   log likelihood = -26925.277  
                          
                          Mixed-effects logistic regression               Number of obs     =     44,288
                          
                          -------------------------------------------------------------
                                          |     No. of       Observations per Group
                           Group Variable |     Groups    Minimum    Average    Maximum
                          ----------------+--------------------------------------------
                                  country |         32        120    1,384.0      2,400
                                      gid |      1,632          1       27.1        558
                          -------------------------------------------------------------
                          
                          Integration method: mvaghermite                 Integration pts.  =          7
                          
                                                                          Wald chi2(13)     =    1274.90
                          Log likelihood = -26925.277                     Prob > chi2       =     0.0000
                          ------------------------------------------------------------------------------
                             fooddummy |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                          -------------+----------------------------------------------------------------
                                gender |  -.0434915   .0214887    -2.02   0.043    -.0856085   -.0013744
                                hhsize |  -.0161911   .0045112    -3.59   0.000    -.0250329   -.0073493
                             secondary |   .7651824   .0265196    28.85   0.000     .7132048    .8171599
                                       |
                             empstatus |
                                    1  |  -.3029409   .0294944   -10.27   0.000    -.3607488    -.245133
                                    2  |  -.1333352   .0374033    -3.56   0.000    -.2066444   -.0600261
                                    3  |   .2222373   .0308052     7.21   0.000     .1618602    .2826144
                                       |
                          ln_gridriots |  -.3477571   .4017664    -0.87   0.387    -1.135205    .4396907
                            lncoverage |   .0607857   .0311374     1.95   0.051    -.0002425     .121814
                                       |
                                    c. |
                          ln_gridriots#|
                          c.lncoverage |   .0773786   .0891297     0.87   0.385    -.0973124    .2520696
                                       |
                                 lnpop |   .0122742   .0289406     0.42   0.671    -.0444483    .0689967
                              ln_ttime |  -.2282592   .0580592    -3.93   0.000    -.3420532   -.1144652
                          nlights~2013 |   .0007555   .0084652     0.09   0.929     -.015836     .017347
                              harvarea |  -5.35e-06   1.59e-06    -3.36   0.001    -8.47e-06   -2.23e-06
                                 _cons |   .9569818    .557426     1.72   0.086    -.1355532    2.049517
                          -------------+----------------------------------------------------------------
                          country      |
                             var(_cons)|   .5228222   .1433091                       .305516    .8946931
                          -------------+----------------------------------------------------------------
                          country>gid  |
                             var(_cons)|   .5005663    .031692                      .4421505    .5666998
                          ------------------------------------------------------------------------------
                          LR test vs. logistic model: chi2(2) = 3970.40             Prob > chi2 = 0.0000
                          
                          Note: LR test is conservative and provided only for reference.
                          (est1 stored)
                          
                          . estimates store logit_riots

                          Then I run margins and get:
                          Code:
                          . margins, dydx(*) atmeans post
                          
                          Conditional marginal effects                    Number of obs     =     44,288
                          Model VCE    : OIM
                          
                          Expression   : Marginal predicted mean, predict()
                          dy/dx w.r.t. : gender hhsize secondary 1.empstatus 2.empstatus 3.empstatus
                                         ln_gridriots lncoverage lnpop ln_ttime nlights_mean2013
                                         harvarea
                          at           : gender          =    .5034998 (mean)
                                         hhsize          =    4.024408 (mean)
                                         secondary       =    .3088421 (mean)
                                         0.empstatus     =     .379132 (mean)
                                         1.empstatus     =    .2383038 (mean)
                                         2.empstatus     =    .1187229 (mean)
                                         3.empstatus     =    .2638412 (mean)
                                         ln_gridriots    =     .789992 (mean)
                                         lncoverage      =    4.283509 (mean)
                                         lnpop           =    12.61969 (mean)
                                         ln_ttime        =    5.242129 (mean)
                                         nlights~2013    =    3.445333 (mean)
                                         harvarea        =    18366.81 (mean)
                          
                          ------------------------------------------------------------------------------
                                       |            Delta-method
                                       |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
                          -------------+----------------------------------------------------------------
                                gender |  -.0089079   .0044053    -2.02   0.043    -.0175422   -.0002736
                                hhsize |  -.0033163   .0009257    -3.58   0.000    -.0051306    -.001502
                             secondary |   .1567249   .0064839    24.17   0.000     .1440167    .1694332
                                       |
                             empstatus |
                                    1  |  -.0622637    .006208   -10.03   0.000    -.0744311   -.0500962
                                    2  |  -.0273551   .0077058    -3.55   0.000    -.0424582    -.012252
                                    3  |   .0450026   .0063153     7.13   0.000     .0326249    .0573803
                                       |
                          ln_gridriots |  -.0033396   .0087805    -0.38   0.704    -.0205491    .0138698
                            lncoverage |   .0249705   .0153476     1.63   0.104    -.0051102    .0550512
                                 lnpop |    .002514   .0059238     0.42   0.671    -.0090964    .0141244
                              ln_ttime |  -.0467521    .011988    -3.90   0.000    -.0702482    -.023256
                          nlights~2013 |   .0001547   .0017345     0.09   0.929    -.0032448    .0035543
                              harvarea |  -1.10e-06   3.27e-07    -3.35   0.001    -1.74e-06   -4.54e-07
                          ------------------------------------------------------------------------------
                          Note: dy/dx for factor levels is the discrete change from the base level.

                          which does not produce an effect for the interaction term.


                          • #14
                            Oh, I thought you meant you weren't getting a regression coefficient for the interaction term. You're talking about the -margins- output. No, there isn't supposed to be a marginal effect for the interaction term, because there is no such thing as a marginal effect for an interaction. The effect of the interaction (though not something labeled with the interaction term itself) shows up in the marginal effects when you calculate the marginal effect of ln_gridriots at different values of lncoverage (or vice versa). To capture how the interaction term affects your outcomes, you need to pick interesting values of ln_gridriots and lncoverage and then get the marginal effects of those variables at those values. Take a look at #4 for an illustration of how this is done. Then you will see that the marginal effect of ln_gridriots depends on the value of lncoverage specified, and vice versa. But interaction terms themselves do not have a marginal effect. It is a common misunderstanding to think that they do.

                            Thanks for posting the output: it really cleared up the question!
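
                            For concreteness, here is a sketch of why that is, written for the fixed part of the model only. It conditions on given random intercepts, whereas -margins- after -melogit- may average over them, so treat it as an illustration of the structure and not as a reproduction of the -margins- computation. The symbols eta, p, beta_r, beta_c, beta_rc and the shorthand r = ln_gridriots, c = lncoverage are notation introduced here; the numbers are the coefficients and means shown in #13.

                            ```latex
                            \begin{aligned}
                            \eta &= \dots + \beta_r r + \beta_c c + \beta_{rc}\, r c, \qquad p = \frac{1}{1 + e^{-\eta}} \\
                            \frac{\partial p}{\partial r} &= p(1-p)\,(\beta_r + \beta_{rc}\, c), \qquad
                            \frac{\partial p}{\partial c} = p(1-p)\,(\beta_c + \beta_{rc}\, r) \\
                            \beta_r + \beta_{rc}\,\bar{c} &= -0.3477571 + 0.0773786 \times 4.283509 \approx -0.0163 \\
                            \beta_c + \beta_{rc}\,\bar{r} &= 0.0607857 + 0.0773786 \times 0.789992 \approx 0.1219
                            \end{aligned}
                            ```

                            The interaction coefficient appears inside both derivatives but has no derivative of its own, which is why -margins- lists no row for it. The dy/dx values reported in #13 (-.0033396 and .0249705) are each roughly 0.205 times these two slopes, which is consistent with a common scaling factor playing the role of p(1-p).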


                            • #15
                              Ah, thank you for the clarification. I guess I hadn't fully understood your earlier explanation, but that helps me conceptualize much better what the interaction is doing in the first place.

                              Many thanks for your continued help with this question.
