
  • Margins after xtmelogit

    Dear all,

    I am running a multilevel logit model with the command 'xtmelogit' and want to calculate margins afterwards. However, I keep getting this error:

    "missing predicted values encountered within the estimation sample"

    I have seen the response/solution from Jeff Pitblado (https://www.statalist.org/forums/for...-stata-margins), but that seems to be a solution for a tobit model. Can someone advise me on what I need to do? The model I run looks like this:

    xtmelogit depvar indvar controlvars || country: indvar, mle var cov(unstr)

    Many thanks in advance,

    Kris

  • #2
    Have you tried adapting Jeff Pitblado's response to your situation? Instead of -predict-ing ystar, just -predict- mu, and then do the same things as in the link you provide, replacing ystar by mu. I see no reason it wouldn't work.
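
    A minimal sketch of the first step of that adaptation, assuming the model has just been fitted. The variable name mu_hat is hypothetical, and the sketch does not reproduce the linked solution; it only runs -predict- and checks whether any observation in the estimation sample is left without a predicted value:

    ```stata
    * Assumes -xtmelogit- (or -meqrlogit-) has just been run.
    * mu_hat is a hypothetical variable name, not taken from the thread.
    predict double mu_hat, mu

    * How many estimation-sample observations have no prediction?
    count if e(sample) & missing(mu_hat)
    ```

    If the count is not zero, those are the observations behind the error message.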

    By the way, unless you are using an old version of Stata, the current name for what you are calling -xtmelogit- is -meqrlogit-. Stata still recognizes the earlier name, but at some point it may stop doing so, so best to use the current terminology.
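
    As a sketch, the model from the first post under the current command name might look like the line below. depvar, indvar and controlvars are the placeholders used in that post. The mle option is left out here because the model is fitted by maximum likelihood in any case; dropping it is my assumption, so check -help meqrlogit- for the options your version accepts:

    ```stata
    * Same model as in the first post, under the current command name.
    * depvar, indvar and controlvars are placeholders, not real variables.
    meqrlogit depvar indvar controlvars || country: indvar, covariance(unstructured) variance
    ```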



    • #3
      Dear Clyde,

      Many thanks for your help, and sorry for my ignorant questions.

      The solution seems to work, but when I predict mu, I lose about 47000 of my 50000 observations that were part of the estimated model. Any idea why this happens, and how I could prevent it? Many thanks again, your help is much appreciated.



      • #4
        I am hard pressed to think of a reason why -predict- would fail to predict mu on any observations that were part of the original estimated model. Can you post the actual complete output you got from -meqrlogit- (or -melogit-, if that's what you ran) and any messages you got from Stata following -predict-? Out-of-sample predictions are not possible, but a prediction should be possible for any observation that is in the estimation sample.
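
        One way to see where predictions go missing, sketched under the assumption that the model has just been fitted with country as the grouping variable. The names insample and mu_check are hypothetical helpers:

        ```stata
        * Run immediately after the estimation command.
        * insample and mu_check are hypothetical variable names.
        generate byte insample = e(sample)
        predict double mu_check, mu

        * Size of the estimation sample; should match "Number of obs" in the output.
        count if insample

        * Estimation-sample observations without a prediction, overall and by country.
        count if insample & missing(mu_check)
        tabulate country if insample & missing(mu_check)
        ```

        If the missing predictions cluster in particular countries, that points toward a group-level cause rather than a problem with individual observations.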

        Also, if you can parse out a small sample of your data that reproduces this problem and use -dataex- to show that when you post back, that would probably be helpful, too. If you are running version 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

        Note: When I ask for the actual output, I mean the complete and unedited response that Stata gave you in the Results window. Please don't show results that have been further processed by -esttab- or -outreg- or other such formatting programs: they are not as helpful.



        • #5
          So here is the exact output:

          xtmelogit q37binnew q45 level3 se2 se3a se5 se9 q43 elections fh gdppercapita corrupt pts
          > || country: q45, mle var cov(unstr)

          Refining starting values:

          Iteration 0: log likelihood = -24835.743 (not concave)
          Iteration 1: log likelihood = -24814.464
          Iteration 2: log likelihood = -24657.996

          Performing gradient-based optimization:

          Iteration 0: log likelihood = -24657.996
          Iteration 1: log likelihood = -24559.151
          Iteration 2: log likelihood = -24501.456
          Iteration 3: log likelihood = -24491.848
          Iteration 4: log likelihood = -24489.076
          Iteration 5: log likelihood = -24488.874
          Iteration 6: log likelihood = -24488.873

          Mixed-effects logistic regression               Number of obs      =     40416
          Group variable: country                         Number of groups   =        16

                                                          Obs per group: min =       886
                                                                         avg =    2526.0
                                                                         max =      4486

          Integration points =   7                        Wald chi2(12)      =    763.57
          Log likelihood = -24488.873                     Prob > chi2        =    0.0000

          ------------------------------------------------------------------------------
             q37binnew |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
          -------------+----------------------------------------------------------------
                   q45 |  -.0753204   .0334112    -2.25   0.024    -.1408052   -.0098356
                level3 |  -.1690492   .0244839    -6.90   0.000    -.2170369   -.1210616
                   se2 |   .1705215   .0228523     7.46   0.000     .1257319    .2153111
                  se3a |   .0063606   .0008598     7.40   0.000     .0046754    .0080458
                   se5 |  -.0068202   .0121726    -0.56   0.575     -.030678    .0170377
                   se9 |   .0488059   .0238373     2.05   0.041     .0020856    .0955262
                   q43 |   .0381727   .0115362     3.31   0.001     .0155623    .0607832
             elections |   .0995321   .0335796     2.96   0.003     .0337173    .1653468
                    fh |  -.3449709   .0921981    -3.74   0.000    -.5256758    -.164266
          gdppercapita |  -.0005423   .0000777    -6.98   0.000    -.0006945   -.0003901
               corrupt |   2.503162   .1190163    21.03   0.000     2.269894     2.73643
                   pts |  -.1279449   .0639624    -2.00   0.045    -.2533088    -.002581
                 _cons |  -2.604731   .9337329    -2.79   0.005    -4.434814   -.7746484
          ------------------------------------------------------------------------------

          ------------------------------------------------------------------------------
            Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
          -----------------------------+------------------------------------------------
          country: Unstructured        |
                              var(q45) |   .0158934   .0068146      .0068588    .0368288
                            var(_cons) |   8.940206   3.695588      3.976387    20.10048
                        cov(q45,_cons) |  -.2280286   .1201598     -.4635374    .0074801
          ------------------------------------------------------------------------------
          LR test vs. logistic regression:     chi2(3) =  3062.52   Prob > chi2 = 0.0000

          Note: LR test is conservative and provided only for reference.

          . predict x

          (53121 missing values generated)
          (option mu assumed; predicted means)

          . sum x

              Variable |       Obs        Mean    Std. Dev.       Min        Max
          -------------+--------------------------------------------------------
                     x |      3328    .6704553    .2141378   .0060365    .829378




          • #6
            So I go from 40416 observations to 3328 observations when I use predict.
            Thanks thus far, your help is much appreciated!!!

            I hope this works:

            dataex q37binnew q45 level3 se2 se3a se5 se9 q43 elections fh gdppercapita corrupt pts in 1/25
            Last edited by Kris Ruijgrok; 28 Nov 2019, 08:28.



            • #7
              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input float q37binnew double(q45 level3 se2 se3a se5 se9 q43) float(elections fh gdppercapita corrupt pts)
              . 5 1 2 20 3 0 3 0 6.5 9290 3.5 4
              0 2 1 2 42 3 1 2 0 6.5 9290 3.5 4
              . 2 1 2 25 3 1 2 0 6.5 9290 3.5 4
              . 5 1 1 23 3 0 3 0 6.5 9290 3.5 4
              . 2 1 2 25 3 1 2 0 6.5 9290 3.5 4
              1 1 0 1 62 2 1 2 0 6.5 9290 3.5 4
              . 2 0 1 32 2 1 2 0 6.5 9290 3.5 4
              1 1 0 1 66 2 1 3 0 6.5 9290 3.5 4
              . 5 0 1 38 3 1 2 0 6.5 9290 3.5 4
              . 1 0 1 60 2 1 2 0 6.5 9290 3.5 4
              . 1 0 1 79 1 1 2 0 6.5 9290 3.5 4
              1 5 0 2 31 3 1 2 0 6.5 9290 3.5 4
              . 4 1 1 49 3 1 3 0 6.5 9290 3.5 4
              1 5 0 1 42 3 1 3 0 6.5 9290 3.5 4
              1 1 0 1 45 1 1 2 0 6.5 9290 3.5 4
              1 1 0 1 50 2 1 2 0 6.5 9290 3.5 4
              . 5 0 1 43 3 1 3 0 6.5 9290 3.5 4
              1 4 0 1 33 3 1 3 0 6.5 9290 3.5 4
              . 2 1 2 26 2 0 3 0 6.5 9290 3.5 4
              . 1 1 1 39 1 1 2 0 6.5 9290 3.5 4
              1 1 1 1 49 3 1 3 0 6.5 9290 3.5 4
              1 1 1 1 59 3 1 3 0 6.5 9290 3.5 4
              0 5 1 1 40 5 1 2 0 6.5 9290 3.5 4
              1 1 0 1 44 2 1 2 0 6.5 9290 3.5 4
              1 1 1 1 40 3 1 2 0 6.5 9290 3.5 4
              end
              label values q45 blalab
              label def blalab 1 "Never", modify
              label def blalab 2 "Hardly ever/few times a year", modify
              label def blalab 4 "At least once a week", modify
              label def blalab 5 "Almost daily", modify
              label values level3 level3
              label def level3 1 "urban", modify
              label values se2 se2
              label def se2 1 "Male", modify
              label def se2 2 "Female", modify
              label values se5 se5
              label def se5 1 "no education/incomplete primary", modify
              label def se5 2 "Complete primary/incomplete secondary", modify
              label def se5 3 "Complete Secondary/Vocational type", modify
              label def se5 5 "MA and above", modify
              label values se9 se9
              label def se9 1 "Employed", modify
              label values q43 q43
              label def q43 2 "Somewhat interested", modify
              label def q43 3 "Not very interested", modify



              • #8
                Thanks. I don't see anything in the output that looks problematic.

                The code cannot be run on your example data, because there is no country variable in it. I also notice that in your example data, whenever level3 != 1, you always have q37binnew = 1, and that several values of q45 perfectly predict q37binnew as well. I suppose that in your real data set these things do not hold, and that it's just a coincidence in a very small sample.
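
                A sketch of how both points could be checked in the full data, assuming the model from #5 has just been fitted. The variable names are the ones shown in that output; nothing else is assumed about the data:

                ```stata
                * Cross-tabulate the outcome against the two predictors mentioned above,
                * restricted to the estimation sample.
                tabulate level3 q37binnew if e(sample)
                tabulate q45 q37binnew if e(sample)

                * A data example that also includes the grouping variable.
                dataex country q37binnew q45 level3 se2 se3a se5 se9 q43 elections fh gdppercapita corrupt pts in 1/25
                ```

                An empty cell in either table would mean the perfect prediction is not just an artifact of the 25-row extract.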

                So I'm a bit stymied at this point. Here are a few things you can do:

                1. First, change -xtmelogit- to its proper current name -meqrlogit-. It shouldn't make a difference, but the postestimation commands do check the name of the estimation command that ran before them.

                2. Try -predict x, fixedonly-. It's not the result you want, but if it gives complete results where -predict x- does not, then it would tell us that the problem is somehow arising with the random effects, and we could focus further troubleshooting efforts there, starting with running -predict- with -reffects- specified and looking for missing values of the random effects.
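
                The second step can be sketched as follows, assuming the model has just been refitted. x_fixed and the stub reff_ are hypothetical names; with a random slope on q45 and a random intercept, the stub should produce two variables, reff_1 and reff_2:

                ```stata
                * Prediction from the fixed portion of the model only.
                * x_fixed is a hypothetical variable name.
                predict double x_fixed, fixedonly
                count if e(sample) & missing(x_fixed)

                * Predicted random effects, one new variable per random term.
                * reff_ is a hypothetical stub; -describe- shows which term each variable holds.
                predict double reff_*, reffects
                describe reff_*

                * Where, if anywhere, are the random effects missing?
                count if e(sample) & (missing(reff_1) | missing(reff_2))
                tabulate country if e(sample) & (missing(reff_1) | missing(reff_2))
                ```

                If x_fixed is complete while the random effects are missing for some countries, the troubleshooting can concentrate on those groups.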

