
  • Predicted probabilities with margins after meologit/ meoprobit

    Dear statalists,

    I wish to derive predicted probabilities after estimating an ordered probit/logit multilevel regression. I have tried to do so with margins, but with only limited success: I could derive predicted probabilities for factor variables but not for continuous variables. For reasons unknown to me, I am not able to estimate margins AT specific values of the covariates after MEOLOGIT or MEOPROBIT.

    My dependent variable has four categories, so I run margins for each outcome separately. I also instructed margins to get the predicted probabilities only from the fixed part of the model: predict(mu fixedonly outcome(#1)). Finally, I wish to keep all other variables at their means.
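
    A minimal sketch of the per-outcome setup described above. It assumes the manual's tvsfpors dataset (used later in this thread) as a stand-in for the poster's own data, and it mirrors the Stata 13 predict() syntax quoted in the post; option names may differ in later releases.

    ```stata
    * Sketch only: tvsfpors stands in for the poster's data
    webuse tvsfpors, clear
    meoprobit thk c.prethk i.cc i.tv || school:

    * The outcome has four ordered categories, so request each one in turn;
    * outcome(#k) selects the k-th category regardless of how it is coded
    forvalues k = 1/4 {
        margins cc, atmeans predict(mu fixedonly outcome(#`k'))
    }
    ```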

    [Image: Meoprobit.jpg]


    Running margins after my model, I am able to derive the predicted probabilities for factor variables (here for outcome 1, "not at all satisfied"). Similarly, I am able to derive the marginal effect of a one-unit increase (this looks good):


    [Image: Margins1 Meoprobit.jpg]


    My problem starts when I try to derive the predicted probabilities not by placing Gender directly after margins but by holding gender AT 0 and 1. I cannot just go with the first solution, since I also wish to derive predicted probabilities across a range of observed values for some continuous variables (Income).

    [Image: Margins3 Meoprobit.jpg]


    Here, margins returns the same values for both males and females. I have no idea why. Any suggestions?

    I find it even more odd because margins returns the correct probabilities with the AT option after an ordinary ordered probit regression, that is, when I discard the multilevel structure of my data for a while. See below:

    [Image: oprobit.jpg]


    Now running margins after oprobit returns the same results:

    [Image: Margins1 oprobit.jpg]



    Any suggestions as to what might be going wrong in the multilevel model? I need to get the second variant, with the AT option, running for:
    a) creating predicted probability graphs, especially for my continuous predictors;
    b) calculating the change in predicted probability (PR change) from the minimum to the maximum value of the continuous variables. Any other suggestions here? I am also looking for a way to obtain the respective confidence intervals;
    c) finally, simplifying the interpretation of some of my main explanatory variables by collapsing the predicted probabilities of two categories (not at all satisfied/not satisfied) into one (not satisfied) and showing the predicted probabilities, PR change, and CI.
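
    For item a), a hedged sketch of a predicted-probability graph over a continuous predictor. It again assumes the tvsfpors data and the Stata 13 predict() options from this thread; the local macro names lo and hi are illustrative only.

    ```stata
    webuse tvsfpors, clear
    meoprobit thk c.prethk i.cc i.tv || school:

    * Observed range of the continuous predictor in the estimation sample
    summarize prethk if e(sample), meanonly
    local lo = r(min)
    local hi = r(max)

    * Adjusted predictions for the first outcome across that range,
    * holding the other covariates at their means
    margins, at(prethk=(`lo'(1)`hi')) atmeans predict(mu fixedonly outcome(#1))

    * Plot the stored margins with their confidence intervals
    marginsplot
    ```

    marginsplot graphs whatever the immediately preceding margins command produced, so it has to be run directly after it.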

    I know this is a lot; I would be glad for any help or suggestions.

    Best,
    Pablo

  • #2
    Problem resolved.

    In fact, the AT option works after meologit/meoprobit, BUT ONLY FOR CONTINUOUS VARIABLES.

    The bug (I consider it one because the same syntax works perfectly after ologit/oprobit) only occurs when one tries to use AT with FACTOR VARIABLES. This, however, is not much of a problem, since you can derive the predicted probabilities by placing the factor variable directly after margins:

    margins Gender, atmeans predict(outcome(#1))

    Furthermore, the AT option also works with categorical variables when you do not tell meologit/meoprobit to treat them as FACTOR VARIABLES in the first place, that is, when you omit the prefix i. or b0. before the variable name. Here is an example where I dropped the prefix i. before gender.

    [Image: Meoprobit_2.jpg]


    Now margins returns the predicted probabilities for gender as required (it would not if we specified it as a factor variable):



    However, the bug will continue to cause problems once somebody estimates interactions, for which FACTOR variables need to be specified, and wants the average marginal effect of one variable X AT specific values of the conditioning variable Z:



    The marginal effect won't change across the values of Z:


    Again we can derive the correct marginal effects by placing the factor variable directly after margins:

    [Image: Margins Meoprobit_final.jpg]
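
    A minimal sketch of the interaction case discussed above, assuming the tvsfpors data, with prethk in the role of X and cc in the role of Z. The model specification is illustrative and is not the poster's own.

    ```stata
    webuse tvsfpors, clear

    * Interaction between a continuous predictor and a factor variable
    meoprobit thk c.prethk##i.cc i.tv || school:

    * Average marginal effect of prethk at each level of cc,
    * with the factor variable placed directly after margins
    margins cc, dydx(prethk) predict(mu fixedonly outcome(#1))

    * The same request written with at(); on a fully updated Stata
    * the two commands should give matching results
    margins, dydx(prethk) at(cc=(0 1)) predict(mu fixedonly outcome(#1))
    ```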






    • #3
      Pablo, I cannot reproduce this problem with the dataset used in the manual entry for meologit.
      Please send a do-file and dataset that reproduces this behavior to Tech Support and we will try to fix
      margins.



      • #4
        Firstly, it is very hard to read your output, since it is very small. Did you put the [HTML] wrapper around your copied Stata output? The post would then look like this:

        HTML Code:
        Fitting fixed-effects model:
        
        Iteration 0:   log likelihood = -34672,325  
        Iteration 1:   log likelihood = -32540,426  
        Iteration 2:   log likelihood = -32515,724  
        Iteration 3:   log likelihood = -32515,669  
        Iteration 4:   log likelihood = -32515,669  
        
        Refining starting values:
        
        Grid node 0:   log likelihood = -32734,732
        
        Fitting full model:
        
        Iteration 0:   log pseudolikelihood = -32734,732  (not concave)
        Iteration 1:   log pseudolikelihood = -32422,985  
        Iteration 2:   log pseudolikelihood = -32332,731  
        Iteration 3:   log pseudolikelihood = -32317,446  
        Iteration 4:   log pseudolikelihood = -32317,393  
        Iteration 5:   log pseudolikelihood = -32317,393  
        Secondly, it is very important that you use factor variables in your meologit command if you want to use margins. You will find information on that in the following link (pages 10-20, but study all of it):

        https://www3.nd.edu/~rwilliam/stats/Margins01.pdf.

        It is written by Richard Williams, who is an active member of this forum and very helpful.

        Thirdly, I am estimating ordinal multilevel models as well, so I know your pain regarding margins/predicted probabilities.



        • #5
          Dear Jeff,

          I could replicate the problem with the dataset used in the manual entry for meoprobit.

          Type in:

          webuse tvsfpors

          meoprobit thk c.prethk i.cc i.tv || school: ,intpoints(7) intmethod(mcaghermite)

          [Image: Model1.jpg]




          margins , at(cc=(0 1)) predict(mu fixedonly outcome(1)) atmeans


          [Image: Margin1.jpg]


          There is the bug again. It should produce:

          [Image: Margin2.jpg]


          In fact it works properly if we DO NOT USE FACTOR VARIABLES. See below:


          meoprobit thk prethk cc tv || school: ,intpoints(7) intmethod(mcaghermite)

          margins , at(cc=(0 1)) predict(mu fixedonly outcome(1)) atmeans

          [Image: Margin3.jpg]


          Hope this helps!

          Best,
          Pablo



          • #6
            There is something strange going on! I copied your syntax (see below) and it works fine.

            HTML Code:
            . webuse tvsfpors
            
            . meoprobit thk c.prethk i.cc i.tv || school: ,intpoints(7) intmethod(mcaghermite)
            
            Fitting fixed-effects model:
            
            Iteration 0:   log likelihood =  -2212,775  
            Iteration 1:   log likelihood = -2130,0541  
            Iteration 2:   log likelihood =  -2130,012  
            Iteration 3:   log likelihood =  -2130,012  
            
            Refining starting values:
            
            Grid node 0:   log likelihood = -2149,8672
            
            Fitting full model:
            
            Iteration 0:   log likelihood = -2149,8672  (not concave)
            Iteration 1:   log likelihood = -2129,9021  (not concave)
            Iteration 2:   log likelihood = -2124,0545  
            Iteration 3:   log likelihood = -2123,2656  
            Iteration 4:   log likelihood = -2123,1749  
            Iteration 5:   log likelihood = -2123,1749  
            
            Mixed-effects oprobit regression                Number of obs      =      1600
            Group variable:          school                 Number of groups   =        28
            
                                                            Obs per group: min =        18
                                                                           avg =      57,1
                                                                           max =       137
            
            Integration method: mcaghermite                 Integration points =         7
            
                                                            Wald chi2(3)       =    123,40
            Log likelihood = -2123,1749                     Prob > chi2        =    0,0000
            ------------------------------------------------------------------------------
                     thk |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                  prethk |   ,2364698    ,022768    10,39   0,000     ,1918453    ,2810942
                    1.cc |   ,3977404   ,0905501     4,39   0,000     ,2202655    ,5752153
                    1.tv |   ,0256499   ,0903513     0,28   0,776    -,1514353    ,2027352
            -------------+----------------------------------------------------------------
                   /cut1 |  -,1423773   ,0935873    -1,52   0,128    -,3258051    ,0410506
                   /cut2 |   ,6025399   ,0938714     6,42   0,000     ,4185553    ,7865245
                   /cut3 |   1,315949   ,0965836    13,62   0,000     1,126649     1,50525
            -------------+----------------------------------------------------------------
            school       |
               var(_cons)|   ,0326451    ,016096                        ,01242     ,085805
            ------------------------------------------------------------------------------
            LR test vs. oprobit regression:  chibar2(01) =    13,67 Prob>=chibar2 = 0,0001
            
            . margins , at(cc=(0 1)) predict(mu fixedonly outcome(1)) atmeans
            
            Adjusted predictions                              Number of obs   =       1600
            Model VCE    : OIM
            
            Expression   : Predicted mean (1.thk), fixed portion only, predict(mu fixedonly outcome(1))
            
            1._at        : prethk          =    2,069375 (mean)
                           cc              =           0
                           0.tv            =     ,500625 (mean)
                           1.tv            =     ,499375 (mean)
            
            2._at        : prethk          =    2,069375 (mean)
                           cc              =           1
                           0.tv            =     ,500625 (mean)
                           1.tv            =     ,499375 (mean)
            
            ------------------------------------------------------------------------------
                         |            Delta-method
                         |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
            -------------+----------------------------------------------------------------
                     _at |
                      1  |   ,2596156   ,0215945    12,02   0,000     ,2172913      ,30194
                      2  |    ,148643   ,0162831     9,13   0,000     ,1167287    ,1805573
            ------------------------------------------------------------------------------



            • #7
              Dear Julian,

              thank you for the resources, although I had encountered them previously. I have no clue, then, what is going on with my machine. I use Stata/SE 13.1 on 64-bit Windows 7.

              Besides, I think it is only crucial to use factor variables with margins when the model includes interactions or higher-order terms.

              Two more questions:
              1.) Do you know a way to collapse categories of the outcome variable for predicted probabilities or the marginal effect?
              So instead of reporting the probabilities for very dissatisfied, dissatisfied, satisfied, very satisfied, just report them for dissatisfied and satisfied (this would result in a much simpler table).
              I have seen this procedure in a journal article: http://www.tandfonline.com/doi/abs/1...82.2014.943524

              2.) Furthermore, do you know a way of deriving the PR change from the minimum value to the maximum value and also creating a confidence interval for it?

              Best,
              Pablo
              Last edited by Pablo Christmann; 23 Feb 2015, 08:30.



              • #8
                Pablo, are you sure that your Stata 13 is fully up-to-date?

                Code:
                . update q
                (contacting http://www.stata.com)
                
                Update status
                    Last check for updates:  23 Feb 2015
                    New update available:    none         (as of 23 Feb 2015)
                    Current update level:    19 Dec 2014  (what's new)
                
                Possible actions
                
                    Do nothing; all files are up to date.
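
                The check shown above can be followed up from within Stata when an update is reported as available. A brief sketch, assuming an internet connection and write access to the Stata installation:

                ```stata
                * Compare the installed files with what is available at stata.com
                update query

                * Download and install all available official updates
                update all

                * Confirm the version and revision date now in use
                about
                ```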



                • #9
                  Dear Jeff,
                  the update did the trick. Margins works now as required.
                  Sorry for not considering this before.
                  Best,
                  Pablo



                  • #10
                    Originally posted by Pablo Christmann

                    Two more questions:
                    1.) Do you know a way to collapse categories of the outcome variable for predicted probabilities or the marginal effect?
                    So instead of reporting the probabilities for very dissatisfied, dissatisfied, satisfied, very satisfied, just report them for dissatisfied and satisfied (this would result in a much simpler table).
                    I have seen this procedure in a journal article: http://www.tandfonline.com/doi/abs/1...82.2014.943524

                    2.) Furthermore, do you know a way of deriving the PR change from the minimum value to the maximum value and also creating a confidence interval for it?

                    Best,
                    Pablo
                    regarding Q (1):
                    I am not aware of any command which does that automatically. But wouldn't this be the same as collapsing the first and second outcome categories into one and the third and fourth into another, and then estimating a binary model (melogit) and the predicted probabilities from it? It is like adding up the predicted probabilities that a person responds with category 1 or 2 (or with 3 or 4) of your outcome variable.
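
                    A rough sketch of the recoding idea suggested above, assuming the tvsfpors data with thk coded 1 to 4. The variable name lowthk is hypothetical. Note that a refitted binary model and the sum of two ordered-model probabilities come from different models, so the numbers need not coincide exactly.

                    ```stata
                    webuse tvsfpors, clear

                    * Collapse categories 1-2 versus 3-4 (assumes thk is coded 1..4)
                    generate byte lowthk = inlist(thk, 1, 2) if !missing(thk)

                    * Binary multilevel model on the collapsed outcome
                    melogit lowthk c.prethk i.cc i.tv || school:

                    * Predicted probability of the low categories by cc, fixed part only
                    margins cc, atmeans predict(mu fixedonly)
                    ```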

                    regarding Q (2):
                    What do you mean by "PR(?) change"?





                    • #11
                      Just because I stumbled over it and we talked about it: using factor variables in your model (if you have dummies or categorical variables) is also important when you want to calculate "marginal effects" with margins, dydx(). See the margins manual, pages 23-26.
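
                      To illustrate the point, a small sketch assuming the nhanes2f data used elsewhere in this thread, where female is a 0/1 indicator. With the i. prefix, margins reports the discrete change in the predicted probability from 0 to 1; without it, the variable is treated as continuous and a derivative is reported instead.

                      ```stata
                      webuse nhanes2f, clear

                      * female declared as a factor variable: dydx() is the discrete change 0 -> 1
                      oprobit health i.female i.black c.age
                      margins, dydx(female) predict(outcome(1))

                      * female entered without i.: dydx() is a derivative, as for a continuous variable
                      oprobit health female black age
                      margins, dydx(female) predict(outcome(1))
                      ```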



                      • #12
                        I believe the following example addresses Pablo's follow-up questions:

                        First the data:

                        Code:
                        . webuse nhanes2f
                        
                        . tab health
                        
                        1=poor,..., |
                        5=excellent |      Freq.     Percent        Cum.
                        ------------+-----------------------------------
                               poor |        729        7.05        7.05
                               fair |      1,670       16.16       23.21
                            average |      2,938       28.43       51.64
                               good |      2,591       25.07       76.71
                          excellent |      2,407       23.29      100.00
                        ------------+-----------------------------------
                              Total |     10,335      100.00
                        Next we fit an ordinal model.

                        Code:
                        . oprobit health female black c.age##c.age
                        
                        Iteration 0:   log likelihood = -15764.397  
                        Iteration 1:   log likelihood = -14912.438  
                        Iteration 2:   log likelihood = -14911.793  
                        Iteration 3:   log likelihood = -14911.793  
                        
                        Ordered probit regression                         Number of obs   =      10335
                                                                          LR chi2(4)      =    1705.21
                                                                          Prob > chi2     =     0.0000
                        Log likelihood = -14911.793                       Pseudo R2       =     0.0541
                        
                        ------------------------------------------------------------------------------
                              health |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                        -------------+----------------------------------------------------------------
                              female |  -.0645111   .0208939    -3.09   0.002    -.1054623   -.0235599
                               black |  -.5146098   .0339597   -15.15   0.000    -.5811696   -.4480499
                                 age |  -.0201136   .0044889    -4.48   0.000    -.0289117   -.0113155
                                     |
                         c.age#c.age |  -.0000463   .0000477    -0.97   0.331    -.0001397    .0000471
                        -------------+----------------------------------------------------------------
                               /cut1 |  -2.785774   .0983601                     -2.978556   -2.592991
                               /cut2 |  -1.968372   .0970133                     -2.158515    -1.77823
                               /cut3 |  -1.109796   .0961253                     -1.298198   -.9213939
                               /cut4 |  -.3588874   .0957267                     -.5465084   -.1712665
                        ------------------------------------------------------------------------------
                        Here is how you can compute the collapsed predicted probability for poor or fair health.

                        Code:
                        . margins, expression(predict(pr outcome(1)) + predict(pr outcome(2)))
                        
                        Predictive margins                                Number of obs   =      10335
                        Model VCE    : OIM
                        
                        Expression   : predict(pr outcome(1)) + predict(pr outcome(2))
                        
                        ------------------------------------------------------------------------------
                                     |            Delta-method
                                     |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
                        -------------+----------------------------------------------------------------
                               _cons |   .2317375   .0039623    58.48   0.000     .2239715    .2395036
                        ------------------------------------------------------------------------------
                        Here is the marginal effect of age on the above.

                        Code:
                        . margins, expression(predict(pr outcome(1)) + predict(pr outcome(2))) dydx(age)
                        
                        Average marginal effects                          Number of obs   =      10335
                        Model VCE    : OIM
                        
                        Expression   : predict(pr outcome(1)) + predict(pr outcome(2))
                        dy/dx w.r.t. : age
                        
                        ------------------------------------------------------------------------------
                                     |            Delta-method
                                     |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
                        -------------+----------------------------------------------------------------
                                 age |   .0069517   .0002322    29.94   0.000     .0064967    .0074068
                        ------------------------------------------------------------------------------
                        Here we compute the difference between the predicted probabilities of the extreme health outcomes.

                        Code:
                        . margins, expression(predict(pr outcome(5)) - predict(pr outcome(1)))
                        
                        Predictive margins                                Number of obs   =      10335
                        Model VCE    : OIM
                        
                        Expression   : predict(pr outcome(5)) - predict(pr outcome(1))
                        
                        ------------------------------------------------------------------------------
                                     |            Delta-method
                                     |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
                        -------------+----------------------------------------------------------------
                               _cons |   .1627442   .0048282    33.71   0.000     .1532812    .1722072
                        ------------------------------------------------------------------------------
                        And here is the marginal effect of age on the above.

                        Code:
                        . margins, expression(predict(pr outcome(5)) - predict(pr outcome(1))) dydx(age)
                        
                        Average marginal effects                          Number of obs   =      10335
                        Model VCE    : OIM
                        
                        Expression   : predict(pr outcome(5)) - predict(pr outcome(1))
                        dy/dx w.r.t. : age
                        
                        ------------------------------------------------------------------------------
                                     |            Delta-method
                                     |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
                        -------------+----------------------------------------------------------------
                                 age |     -.0098   .0002511   -39.02   0.000    -.0102922   -.0093078
                        ------------------------------------------------------------------------------
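
                        The same expression() approach can also be evaluated at chosen covariate values. A brief sketch under the model fitted above; because black was entered without the i. prefix, it is set through at() rather than listed directly after margins.

                        ```stata
                        webuse nhanes2f, clear
                        oprobit health female black c.age##c.age

                        * Collapsed probability of poor or fair health, evaluated with black set to 0 and to 1
                        margins, at(black=(0 1)) expression(predict(pr outcome(1)) + predict(pr outcome(2)))
                        ```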



                        • #13
                          Dear Jeff,

                          thank you very much for your great help! Your example fully answers my first question, and more.

                          However, let me rephrase my second question; I realize now that I formulated it very poorly in the first place:

                          Is there a way to calculate first differences of predicted probabilities with margins for specific values of age? More specifically, between the minimum and maximum values of a continuous variable (for age in this dataset, 20 and 74)? In other words, I am interested in the change in the predicted probability from age=20 to age=74.

                          I know how to get it with prchange, or manually with margins by subtracting the prediction at the minimum from the prediction at the maximum. What I am really lacking is the respective confidence interval!

                          See the example below:

                          HTML Code:
                          . webuse nhanes2f
                          
                          .
                          . oprobit health female black age
                          
                          Iteration 0:   log likelihood = -15764.397 
                          Iteration 1:   log likelihood = -14912.902 
                          Iteration 2:   log likelihood = -14912.265 
                          Iteration 3:   log likelihood = -14912.265 
                          
                          Ordered probit regression                         Number of obs   =      10335
                                                                            LR chi2(3)      =    1704.26
                                                                            Prob > chi2     =     0.0000
                          Log likelihood = -14912.265                       Pseudo R2       =     0.0541
                          
                          ------------------------------------------------------------------------------
                                health |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
                          -------------+----------------------------------------------------------------
                                female |   -.064733   .0208928    -3.10   0.002    -.1056821   -.0237839
                                 black |  -.5149209   .0339591   -15.16   0.000    -.5814796   -.4483622
                                   age |  -.0244315   .0006292   -38.83   0.000    -.0256646   -.0231984
                          -------------+----------------------------------------------------------------
                                 /cut1 |  -2.872614   .0410632                     -2.953096   -2.792132
                                 /cut2 |  -2.055493   .0370311                     -2.128073   -1.982914
                                 /cut3 |  -1.196979   .0344675                     -1.264534   -1.129424
                                 /cut4 |  -.4459914   .0335569                     -.5117617   -.3802211
                          ------------------------------------------------------------------------------
                          
                          .
                          .
                          Now I get the predicted probability for the minimum and maximum values with margins:

                          HTML Code:
                          . margins, at(age=(20 74)) predict(outcome(1))  atmeans
                          
                          Adjusted predictions                              Number of obs   =      10335
                          Model VCE    : OIM
                          
                          Expression   : Pr(health==1), predict(outcome(1))
                          
                          1._at        : female          =    .5250121 (mean)
                                         black           =    .1050798 (mean)
                                         age             =          20
                          
                          2._at        : female          =    .5250121 (mean)
                                         black           =    .1050798 (mean)
                                         age             =          74
                          
                          ------------------------------------------------------------------------------
                                       |            Delta-method
                                       |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
                          -------------+----------------------------------------------------------------
                                   _at |
                                    1  |   .0108411   .0008266    13.11   0.000     .0092209    .0124612
                                    2  |   .1643857   .0058107    28.29   0.000      .152997    .1757745
                          ------------------------------------------------------------------------------
                          .
                          Now I can calculate the change in probability across the range of age:

                          HTML Code:
                          . display .1643857 -  .0108411
                          .1535446
                          .
                          I could derive the same estimate by relying on prchange and looking at the column Min->Max:

                          HTML Code:
                          . prchange age
                          
                          oprobit: Changes in Probabilities for health
                          
                          age
                                      Avg|Chg|        poor        fair     average        good    excellen
                          Min->Max    .1960772   .15354468   .21350347   .12314484  -.11572406  -.37446894
                             -+1/2   .00389309   .00261393   .00443359   .00268519  -.00267902   -.0070537
                            -+sd/2   .06654138   .04554091   .07548538   .04532719  -.04522246  -.12113099
                          MargEfct   .00389318   .00261383   .00443374   .00268539   -.0026792  -.00705376
                          
                                        poor       fair    average       good   excellen
                          Pr(y|x)  .05235707  .15796733  .31089664   .2681399  .21063906
                          
                                  female    black      age
                             x=  .525012   .10508  47.5658
                          sd_x=  .499398  .306671  17.2175
                          
                          .
                          end of do-file
                          
                          .
                          What I am still missing is the respective confidence interval for the change in probability. Could you think of a way?

                          I believe it is more useful to compare the change in probability over the range of a continuous variable than to report the marginal effects, especially when we want to compare the size of the effects with those of categorical variables.

                          Since margins in fact does not return marginal effects for factor variables but first differences, and reports the respective confidence interval, I am hoping there might be a similar way to get the CI for differences in predicted probabilities between the minimum and maximum values of continuous variables.

                          Best,
                          Pablo
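
                          One possible way to attach a confidence interval to the manual subtraction shown above is to post the margins results and difference them with lincom. This is a sketch under the same model, not necessarily the recommended route; the stored-estimates name oprobit_fit is arbitrary.

                          ```stata
                          webuse nhanes2f, clear
                          oprobit health female black age

                          * Keep the model, because the post option overwrites the estimation results
                          estimates store oprobit_fit

                          * Post the two adjusted predictions as e(b) and e(V)
                          margins, at(age=(20 74)) predict(outcome(1)) atmeans post

                          * Difference between the prediction at age 74 and at age 20,
                          * with a delta-method standard error and confidence interval
                          lincom _b[2._at] - _b[1._at]

                          * Bring the original model back for further postestimation
                          estimates restore oprobit_fit
                          ```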



                          • #14
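                            What you want is a contrast of the _at results. This can be done by adding
                            the option contrast(atcontrast(r)); the effects suboption additionally reports the
                            contrast with its standard error and confidence interval. See help margins contrast.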

                            Code:
                            . margins, at(age=(20 74)) predict(outcome(1))  atmeans contrast(atcontrast(r) effects)
                            
                            Contrasts of adjusted predictions
                            Model VCE    : OIM
                            
                            Expression   : Pr(health==1), predict(outcome(1))
                            
                            1._at        : female          =    .5250121 (mean)
                                           black           =    .1050798 (mean)
                                           age             =          20
                            
                            2._at        : female          =    .5250121 (mean)
                                           black           =    .1050798 (mean)
                                           age             =          74
                            
                            ------------------------------------------------
                                         |         df        chi2     P>chi2
                            -------------+----------------------------------
                                     _at |          1      718.68     0.0000
                            ------------------------------------------------
                            
                            ------------------------------------------------------------------------------
                                         |            Delta-method
                                         |   Contrast   Std. Err.      z    P>|z|     [95% Conf. Interval]
                            -------------+----------------------------------------------------------------
                                     _at |
                            (2 vs base)  |   .1535447   .0057275    26.81   0.000     .1423189    .1647704
                            ------------------------------------------------------------------------------
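
                            The contrast approach can be combined with the collapsed-probability expression from earlier in the thread, with the endpoints taken from the data rather than typed in. A minimal sketch, meant to be run from a do-file because of the /// line continuations; the local macro names lo and hi are illustrative.

                            ```stata
                            webuse nhanes2f, clear
                            oprobit health female black age

                            * Minimum and maximum of age in the estimation sample
                            summarize age if e(sample), meanonly
                            local lo = r(min)
                            local hi = r(max)

                            * Change in Pr(poor or fair health) from the minimum to the maximum age,
                            * reported with a standard error and confidence interval
                            margins, at(age=(`lo' `hi')) atmeans ///
                                expression(predict(pr outcome(1)) + predict(pr outcome(2))) ///
                                contrast(atcontrast(r) effects)
                            ```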



                            • #15
                              This is what I was looking for. Many thanks!
