  • Finding the difference between predicted probabilities of the subpopulation of the outcome of a logit model.

    I am trying to fit a difference-in-differences model for a binary outcome. I came across the paper "Difference-in-differences with an ordinal dependent variable: assessing the impact of the London bombings on the safety perceptions of Muslims" (whiterose.ac.uk), which applies the non-linear approach of "Interaction terms in logit and probit models" (ScienceDirect) to an ordinal outcome using an ordered probit model.

    Below is the code from the first paper (I do not have its data):
    Code:
    oprobit Y D T DT x1 x2, vce(robust)
    margins, dydx(DT) vce(unconditional) subpop(DT) post
    nlcom [DT]1._predict - [DT]2._predict
    I understand that I may not be able to use this directly, since a binary outcome is not ordered, but I tried to apply the same approach with logit. The problem is that I cannot refer to the predicted probabilities for the subpopulations of the binary outcome.

    Below is the code I tried, which fails with a "[0._predict] not found" error. Your assistance will be much appreciated. Thanks. A sample of my data is included at the end of this post.
    Code:
    logit savemoney kd_did i.age i.period i.progexp [pw=kdweight] if state==2, vce(robust) 
    margins, dydx(kd_did) vce(unconditional) over(kd_did) post
    nlcom [kd_did]1._predict - [kd_did]2._predict
    With the following output:
    Code:
    . logit savemoney kd_did i.age i.period i.progexp [pw=kdweight] if state==2, vce(robust) 
    
    Iteration 0:   log pseudolikelihood = -701.54708  
    Iteration 1:   log pseudolikelihood =  -605.8859  
    Iteration 2:   log pseudolikelihood = -587.40382  
    Iteration 3:   log pseudolikelihood = -586.56099  
    Iteration 4:   log pseudolikelihood = -586.55551  
    Iteration 5:   log pseudolikelihood = -586.55551  
    
    Logistic regression                                     Number of obs =  1,168
                                                            Wald chi2(8)  =  41.97
                                                            Prob > chi2   = 0.0000
    Log pseudolikelihood = -586.55551                       Pseudo R2     = 0.1639
    
    -------------------------------------------------------------------------------------
                        |               Robust
              savemoney | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
    --------------------+----------------------------------------------------------------
                 kd_did |   3.167547   .6370738     4.97   0.000     1.918906    4.416189
                        |
                    age |
                    16  |   .6278054    .947813     0.66   0.508    -1.229874    2.485485
                    17  |    2.00357   .9929915     2.02   0.044     .0573424    3.949798
                    18  |    .867744   .8106324     1.07   0.284    -.7210664    2.456554
                    19  |   .7230419   .7988291     0.91   0.365    -.8426343    2.288718
                    20  |   1.230873   .9438616     1.30   0.192    -.6190615    3.080808
                        |
                 period |
               Endline  |  -.4932012   .2819289    -1.75   0.080    -1.045772    .0593693
                        |
                progexp |
    Intervention Group  |   .1317869   .3128276     0.42   0.674    -.4813439    .7449178
                  _cons |   .4604072   .8164822     0.56   0.573    -1.139869    2.060683
    -------------------------------------------------------------------------------------
    
    . margin , dydx(kd_did) vce(unconditional) over(kd_did) post // subpop(kd_did) post
    
    Average marginal effects                                 Number of obs = 1,168
    
    Expression: Pr(savemoney), predict()
    dy/dx wrt:  kd_did
    Over:       kd_did
    
    ------------------------------------------------------------------------------
                 |            Unconditional
                 |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
    -------------+----------------------------------------------------------------
    kd_did       |
          kd_did |
              0  |   .6109751   .1309218     4.67   0.000     .3543731    .8675772
              1  |   .0520365   .0187694     2.77   0.006     .0152492    .0888239
    ------------------------------------------------------------------------------
    
    . nlcom [kd_did]0._predict - [kd_did]1._predict
    
    [0._predict] not found
    Below is my data:
    Code:
    1 1 19 1 1     1.1519125 2
    1 1 18 1 1     1.2058938 2
    . 0 19 0 1             1 2
    1 1 19 1 1     1.1913565 2
    1 0 19 0 1             1 2
    1 1 19 1 1      .9901584 2
    0 0 19 0 1             1 2
    . 0 19 0 1             1 2
    1 1 20 1 1 1.2315145e-08 2
    1 0 19 0 1             1 2
    1 1 19 1 1       .747017 2
    1 0 19 0 1             1 2
    . 0 19 0 1             1 2
    1 0 19 0 1             1 2
    1 0 19 0 1             1 2
    1 0 19 0 1             1 2
    1 1 19 1 1      1.002548 2
    1 1 19 1 1      .6983826 2
    1 0 19 0 1             1 2
    1 1 19 1 1      .9901584 2
    . 0 19 0 1             1 2
    . 0 19 0 1             1 2
    1 1 19 1 1     1.1913565 2
    1 1 19 1 1       .747017 2
    . 0 19 0 1             1 2
    . 0 19 0 1             1 2
    . 0 17 0 1             1 2
    1 1 15 1 1     1.1977272 2
    . 0 19 0 1             1 2
    1 1 18 1 1      .9097768 2
    1 1 19 1 1     .58043873 2
    1 0 19 0 1             1 2
    1 1 19 1 1      .6983826 2
    1 0 18 0 1             1 2
    1 1 19 1 1      .9901584 2
    1 0 19 0 1             1 2
    . 0 19 0 1             1 2
    . 0 19 0 1             1 2
    1 1 19 1 1     1.1913565 2
    1 1 19 1 1      .6983826 2
    1 0 19 0 1             1 2
    1 1 19 1 1     1.1913565 2
    . 0 19 0 1             1 2
    . 0 19 0 1             1 2
    1 1 19 1 1      .6983826 2
    . 0 19 0 1             1 2
    1 1 19 1 1     1.1913565 2
    . 0 18 0 1             1 2
    1 1 19 1 1     .58043873 2
    . 0 18 0 1             1 2
    1 1 18 1 1      .9097768 2
    . 0 17 0 1             1 2
    . 0 19 0 1             1 2
    . 0 18 0 1             1 2
    1 1 19 1 1      .6983826 2
    . 0 19 0 1             1 2
    . 0 19 0 1             1 2
    1 1 19 1 1     1.7102258 2
    . 0 18 0 1             1 2
    . 0 19 0 1             1 2
    1 1 19 1 1      .7460096 2
    1 0 19 0 1             1 2
    1 1 20 1 1  7.219235e-09 2
    . 0 17 0 1             1 2
    1 1 17 1 1      .8503983 2
    1 1 19 1 1      .9901584 2
    . 0 16 0 1             1 2
    1 1 19 1 1       .747017 2
    . 0 19 0 1             1 2
    1 1 19 1 1     1.1913565 2
    . 0 19 0 1             1 2
    1 1 19 1 1     .58043873 2
    . 0 19 0 1             1 2
    1 0 16 0 1             1 2
    . 0 19 0 1             1 2
    1 1 19 1 1      .9901584 2
    . 0 19 0 1             1 2
    . 0 19 0 1             1 2
    1 1 19 1 1      .8950461 2
    . 0 17 0 1             1 2
    1 1 17 1 1 1.5221377e-07 2
    . 0 19 0 1             1 2
    1 1 19 1 1     1.1913565 2
    1 1 19 1 1      .6983826 2
    . 0 19 0 1             1 2
    1 1 19 1 1      .6983826 2
    . 0 19 0 1             1 2
    1 0 19 0 1             1 2
    . 0 19 0 1             1 2
    1 1 19 1 1     1.1913565 2
    1 0 19 0 1             1 2
    1 1 19 1 1     1.1913565 2
    . 0 19 0 1             1 2
    . 0 19 0 1             1 2
    1 1 19 1 1      .8988092 2
    1 0 19 0 1             1 2
    1 1 20 1 1  9.316057e-09 2
    1 0 19 0 1             1 2
    1 1 20 1 1   6.00004e-09 2
    1 0 18 0 1             1 2
    end
    label values savemoney yesno2
    label def yesno2 0 "No", modify
    label def yesno2 1 "Yes", modify
    label values period period
    label def period 0 "Baseline", modify
    label def period 1 "Endline", modify
    label values progexp exposure
    label def exposure 1 "Intervention Group", modify
    label values state state
    label def state 2 "Kaduna", modify


  • #2
    If the question is how to refer to coefficients from margins, then note that the option -coeflegend- is allowed.

    Code:
    margins, dydx(kd_did) vce(unconditional) over(kd_did) post coeflegend



    • #3
      Originally posted by Andrew Musau
      If the question is how to refer to coefficients from margins, then note that the option -coeflegend- is allowed.

      Code:
      margins, dydx(kd_did) vce(unconditional) over(kd_did) post coeflegend
      Thank you. The coeflegend worked.

      Code:
      logit savemoney kd_did i.age i.period i.progexp [pw=kdweight] if state==2, vce(robust) 
      margins, dydx(kd_did) vce(unconditional) over(kd_did) post 
      nlcom _b[kd_did:0bn.kd_did] - _b[kd_did:1.kd_did]
      
      . logit savemoney kd_did i.age i.period i.progexp [pw=kdweight] if state==2, vce(robust) 
      
      Iteration 0:   log pseudolikelihood = -701.54708  
      Iteration 1:   log pseudolikelihood =  -605.8859  
      Iteration 2:   log pseudolikelihood = -587.40382  
      Iteration 3:   log pseudolikelihood = -586.56099  
      Iteration 4:   log pseudolikelihood = -586.55551  
      Iteration 5:   log pseudolikelihood = -586.55551  
      
      Logistic regression                                     Number of obs =  1,168
                                                              Wald chi2(8)  =  41.97
                                                              Prob > chi2   = 0.0000
      Log pseudolikelihood = -586.55551                       Pseudo R2     = 0.1639
      
      -------------------------------------------------------------------------------------
                          |               Robust
                savemoney | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
      --------------------+----------------------------------------------------------------
                   kd_did |   3.167547   .6370738     4.97   0.000     1.918906    4.416189
                          |
                      age |
                      16  |   .6278054    .947813     0.66   0.508    -1.229874    2.485485
                      17  |    2.00357   .9929915     2.02   0.044     .0573424    3.949798
                      18  |    .867744   .8106324     1.07   0.284    -.7210664    2.456554
                      19  |   .7230419   .7988291     0.91   0.365    -.8426343    2.288718
                      20  |   1.230873   .9438616     1.30   0.192    -.6190615    3.080808
                          |
                   period |
                 Endline  |  -.4932012   .2819289    -1.75   0.080    -1.045772    .0593693
                          |
                  progexp |
      Intervention Group  |   .1317869   .3128276     0.42   0.674    -.4813439    .7449178
                    _cons |   .4604072   .8164822     0.56   0.573    -1.139869    2.060683
      -------------------------------------------------------------------------------------
      
      . margins, dydx(kd_did) vce(unconditional) over(kd_did) post 
      
      Average marginal effects                                 Number of obs = 1,168
      
      Expression: Pr(savemoney), predict()
      dy/dx wrt:  kd_did
      Over:       kd_did
      
      ------------------------------------------------------------------------------
                   |            Unconditional
                   |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
      -------------+----------------------------------------------------------------
      kd_did       |
            kd_did |
                0  |   .6109751   .1309218     4.67   0.000     .3543731    .8675772
                1  |   .0520365   .0187694     2.77   0.006     .0152492    .0888239
      ------------------------------------------------------------------------------
      
      . nlcom _b[kd_did:0bn.kd_did] - _b[kd_did:1.kd_did]
      
             _nl_1: _b[kd_did:0bn.kd_did] - _b[kd_did:1.kd_did]
      
      ------------------------------------------------------------------------------
                   | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
      -------------+----------------------------------------------------------------
             _nl_1 |   .5589386   .1421614     3.93   0.000     .2803073    .8375699
      ------------------------------------------------------------------------------
      However, I am now wondering whether the margins and nlcom lines did the same job as they do with ologit, shown below. What I actually wanted are the predicted values at savemoney=0 and at savemoney=1, as in the ologit output below, but that does not seem to be what the logit results give. Is it possible to get these with logit? Thanks.

      Code:
      ologit savemoney kd_did i.age i.highedulevel i.ethnic i.childnum i.period i.progexp [pw=kdweight] if state==2, vce(robust)
      margins, dydx(kd_did) vce(unconditional) subpop(kd_did) post
      nlcom [kd_did]1._predict - [kd_did]2._predict
      
      . ologit savemoney kd_did i.age i.highedulevel i.ethnic i.childnum i.period i.progexp [pw=kdweight] if state==2, vce(robust)
      
      Iteration 0:   log pseudolikelihood = -701.54708  
      Iteration 1:   log pseudolikelihood =  -596.1297  
      Iteration 2:   log pseudolikelihood = -575.52064  
      Iteration 3:   log pseudolikelihood = -574.51627  
      Iteration 4:   log pseudolikelihood = -574.50081  
      Iteration 5:   log pseudolikelihood = -574.49762  
      Iteration 6:   log pseudolikelihood =  -574.4969  
      Iteration 7:   log pseudolikelihood = -574.49672  
      Iteration 8:   log pseudolikelihood = -574.49668  
      Iteration 9:   log pseudolikelihood = -574.49668  
      
      Ordered logistic regression                             Number of obs =  1,168
                                                              Wald chi2(20) = 395.04
                                                              Prob > chi2   = 0.0000
      Log pseudolikelihood = -574.49668                       Pseudo R2     = 0.1811
      
      ------------------------------------------------------------------------------------------
                               |               Robust
                     savemoney | Coefficient  std. err.      z    P>|z|     [95% conf. interval]
      -------------------------+----------------------------------------------------------------
                        kd_did |   3.212439   .6423242     5.00   0.000     1.953507    4.471371
                               |
                           age |
                           16  |    .547495   .9666026     0.57   0.571    -1.347011    2.442001
                           17  |   1.742254   1.041358     1.67   0.094    -.2987712    3.783279
                           18  |   .5793496   .8698062     0.67   0.505    -1.125439    2.284138
                           19  |    .473355   .8706683     0.54   0.587    -1.233124    2.179834
                           20  |   .9644165   1.006985     0.96   0.338    -1.009237     2.93807
                               |
                  highedulevel |
                   Islamiyyah  |  -.1427443   .6020627    -0.24   0.813    -1.322765    1.037277
                      Primary  |   .2401711   .6220503     0.39   0.699     -.979025    1.459367
      Junior Secondary School  |   .3426247   .5975378     0.57   0.566    -.8285279    1.513777
      Senior Secondary School  |   .8186172   .5759525     1.42   0.155    -.3102289    1.947463
              Above secondary  |     .31397   .6309992     0.50   0.619    -.9227657    1.550706
                               |
                        ethnic |
                        Hausa  |  -11.57713    .974074   -11.89   0.000    -13.48628   -9.667984
                       Fulani  |  -11.99326   1.048266   -11.44   0.000    -14.04783   -9.938699
                         Igbo  |   1.492786   1.497149     1.00   0.319    -1.441572    4.427144
                       Others  |  -12.00018     1.0283   -11.67   0.000    -14.01561   -9.984751
                               |
                      childnum |
                    One Child  |   .4441028   .3680415     1.21   0.228    -.2772453    1.165451
                 Two Children  |    .159522   .3825836     0.42   0.677    -.5903281    .9093721
       Three or more Children  |   .0907047   .4039654     0.22   0.822     -.701053    .8824624
                               |
                        period |
                      Endline  |  -.4846891   .2863782    -1.69   0.091     -1.04598    .0766019
                               |
                       progexp |
           Intervention Group  |   .1447468   .3206002     0.45   0.652     -.483618    .7731117
      -------------------------+----------------------------------------------------------------
                         /cut1 |  -11.46735   1.390522                     -14.19272   -8.741976
      ------------------------------------------------------------------------------------------
      Note: 2 observations completely determined. Standard errors questionable.
      
      . 
      . margins, dydx(kd_did) vce(unconditional) subpop(kd_did) post
      
      Average marginal effects                               Number of obs   = 1,168
                                                             Subpop. no. obs =   474
      
      dy/dx wrt: kd_did
      
      1._predict: Pr(savemoney==0), predict(pr outcome(0))
      2._predict: Pr(savemoney==1), predict(pr outcome(1))
      
      ------------------------------------------------------------------------------
                   |            Unconditional
                   |      dy/dx   std. err.      z    P>|z|     [95% conf. interval]
      -------------+----------------------------------------------------------------
      kd_did       |
          _predict |
                1  |  -.0526479   .0191132    -2.75   0.006    -.0901091   -.0151867
                2  |   .0526479   .0191132     2.75   0.006     .0151867    .0901091
      ------------------------------------------------------------------------------
      
      . 
      . nlcom [kd_did]1._predict - [kd_did]2._predict
      
             _nl_1: [kd_did]1._predict - [kd_did]2._predict
      
      ------------------------------------------------------------------------------
                   | Coefficient  Std. err.      z    P>|z|     [95% conf. interval]
      -------------+----------------------------------------------------------------
             _nl_1 |  -.1052958   .0382265    -2.75   0.006    -.1802183   -.0303733
      ------------------------------------------------------------------------------



      • #4
        What I actually wanted are the predict values at savemoney=0 and at savemoney=1 as shown below with ologit but it seems that is not the case with the logit's result.
        In both the -logit- and -ologit- cases what -margins- is giving you is the marginal effect of variable kd_did on the probability of savemoney, not the expected values of savemoney.

        To get the expected values, the -margins- command should be:
        Code:
        margins, vce(unconditional) subpop(kd_did) post
        By the way, the use of -subpop(kd_did)- means that your -margins- calculations are restricted to the observations for which variable kd_did is non-missing and also non-zero. You will be excluding any observations in which kd_did == 0 or missing value from the calculation. That's perfectly fine if that's what you intend, but it isn't something commonly done. So just wanted to make sure it wasn't a misunderstanding.
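
        For comparison, a minimal sketch of the -over()- alternative, which keeps both kd_did groups and reports one predicted probability for each. It assumes the -logit- model from #1 and that kd_did is coded 0/1. The _b[] names in the last line are the ones I would expect the legend to show, so check them against the -coeflegend- output before relying on them.

        ```stata
        * Sketch only: refit the logit model from #1.
        logit savemoney kd_did i.age i.period i.progexp [pw=kdweight] if state==2, vce(robust)

        * No dydx(): this requests the average predicted Pr(savemoney)
        * separately for kd_did == 0 and kd_did == 1.
        * coeflegend displays the names to use inside _b[].
        margins, vce(unconditional) over(kd_did) post coeflegend

        * Difference between the two group-level predicted probabilities.
        * Assumed names; replace them with whatever the legend reports.
        nlcom _b[1.kd_did] - _b[0bn.kd_did]
        ```

        With -subpop(kd_did)- there is only one subpopulation, so only one margin is reported; -over(kd_did)- is what produces two estimates that can be differenced.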
        Last edited by Clyde Schechter; 15 Jun 2023, 22:44.



        • #5
          Originally posted by Clyde Schechter
          By the way, the use of -subpop(kd_did)- means that your -margins- calculations are restricted to the observations for which variable kd_did is non-missing and also non-zero. You will be excluding any observations in which kd_did == 0 or missing value from the calculation. That's perfectly fine if that's what you intend, but it isn't something commonly done. So just wanted to make sure it wasn't a misunderstanding.
          Yes, I am more interested in -subpop(kd_did)-, but below is what I get, rather than something similar to the -ologit- results.

          Code:
          . margins, vce(unconditional) subpop(kd_did) post coeflegend
          
          Predictive margins                                     Number of obs   = 1,168
                                                                 Subpop. no. obs =   474
          
          Expression: Pr(savemoney), predict()
          
          ------------------------------------------------------------------------------
                       |     Margin   Legend
          -------------+----------------------------------------------------------------
                 _cons |    .983281  _b[_cons]
          ------------------------------------------------------------------------------



          • #6
            But, remember, as I pointed out in #4, what you got from -ologit- is not what you say you want. Had you done the -ologit- analysis correctly, it would have looked like this.



            • #7
              Originally posted by Clyde Schechter
              But, remember, as I pointed out in #4, what you got from -ologit- is not what you say you want. Had you done the -ologit- analysis correctly, it would have looked like this.
              Perhaps I am not expressing myself clearly; pardon me. What I want is something similar to what I get from -ologit-. I am trying to replicate with -logit- what was done using -ologit-. Thank you.

              My main goal is to achieve what was stated in https://www.sciencedirect.com/scienc...65176503000326, i.e. "The interaction effect, which is often the variable of interest in applied econometrics, cannot be evaluated simply by looking at the sign, magnitude, or statistical significance of the coefficient on the interaction term when the model is nonlinear. Instead, the interaction effect requires computing the cross derivative or cross difference. Like the marginal effect of a single variable, the magnitude of the interaction effect depends on all the covariates in the model. In addition, it can have ..."
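
              One way the cross difference described in that passage might be computed with -margins- is sketched below. This is an illustration under assumptions, not the method of either paper: it assumes kd_did is the product of progexp and period, so the interaction is re-entered with factor-variable notation, and it averages over the whole estimation sample rather than over a subpopulation.

              ```stata
              * Sketch only. Assumes kd_did == progexp*period, so the interaction
              * is rebuilt with ## and margins can vary each factor separately.
              logit savemoney i.progexp##i.period i.age [pw=kdweight] if state==2, vce(robust)

              * Average predicted Pr(savemoney) in each progexp-by-period cell.
              margins progexp#period, vce(unconditional)

              * Cross difference in predicted probabilities:
              * (intervention endline - intervention baseline)
              *   - (comparison endline - comparison baseline).
              margins r.progexp#r.period, vce(unconditional)
              ```

              Because the contrast is taken on the probability scale, its size depends on the other covariates, which is the point the quoted passage makes about non-linear models.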

              PS: Can a binary outcome of yes (1) and no (0) be considered ordinal in the context of an intervention study?
              Last edited by Kehinde Atoloye; 18 Jun 2023, 07:01.



              • #8
                OK, the difference between the -margins- results for -ologit- and -logit- arises because with -ologit- you specified -subpop(kd_did)-, but with -logit- you specified -over(kd_did)-.

                I believe what you actually want for both is the -over(kd_did)- specification. (Actually, I think you would be better off doing -margins kd_did, dydx(kd_did) vce(unconditional) post- for both, but I won't press that point.)
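
                A minimal sketch of running the same -margins- specification after both models. It assumes the covariate list from the -logit- model in #1 is used for both, which differs from the longer covariate list used with -ologit- in #3. The -predict(outcome(1))- option is added after -ologit- so that both commands report effects on Pr(savemoney==1).

                ```stata
                * Sketch only: identical covariates so the two results are comparable.
                logit savemoney kd_did i.age i.period i.progexp [pw=kdweight] if state==2, vce(robust)
                margins, dydx(kd_did) vce(unconditional) over(kd_did)

                ologit savemoney kd_did i.age i.period i.progexp [pw=kdweight] if state==2, vce(robust)
                * predict(outcome(1)) asks for Pr(savemoney==1), the quantity
                * that margins reports by default after logit.
                margins, dydx(kd_did) vce(unconditional) over(kd_did) predict(outcome(1))
                ```

                Adding -post coeflegend- to either -margins- line shows the names to pass to -nlcom-.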
