  • Report predicted probability from Probit model using outreg2

    Hi everyone,

    I'm using outreg2 to export my probit results to Excel.

    Here's the command and output:
    Code:
    preserve
    keep if period>=0 & period<=3
    dprobit d_earnout DA_pa abn_cfo disc_exp abn_prod_ta deal_size_w deal_size2_w merger acq_assets pre07 t_public roa_lag_w log_ta_w crossborder leverage_lag_w related2 mtb_lag_w, robust
    predict p1
    outreg2 d_earnout period abn_cfo disc_exp abn_prod_ta deal_size_w deal_size2_w merger acq_assets pre07 t_public roa_lag_w log_ta_w crossborder leverage_lag_w related2 mtb_lag_w using probit.xls, drop(_I* d_earnout) margin addstat(Pseudo R-squared, e(r2_p), Actual Prob., e(pbar), Predicted Prob., e(p)) addtext(Year FE, No, Industry FE, No) stats(coef tstat) adec(4) bdec(4) tdec(2) coefastr append
    
    Probit regression, reporting marginal effects           Number of obs =   8161
                                                            Wald chi2(16) = 352.57
                                                            Prob > chi2   = 0.0000
    Log pseudolikelihood = -2941.1377                       Pseudo R2     = 0.0639
    
    ------------------------------------------------------------------------------
             |               Robust
    d_earn~t |      dF/dx   Std. Err.      z    P>|z|     x-bar  [    95% C.I.   ]
    ---------+--------------------------------------------------------------------
       DA_pa |  -1.17e-07   .0000137    -0.01   0.993  -5.67065  -.000027  .000027
     abn_cfo |  -.0001204   .0000425    -2.82   0.005   1.13782  -.000204 -.000037
    disc_exp |  -.0001872   .0001171    -1.60   0.111   2.01864  -.000417  .000042
    abn_pr~a |   .0001522   .0000617     2.46   0.014   3.67435   .000031  .000273
    deal~e_w |  -.1224253   .2401343    -0.51   0.610   .002953   -.59308  .348229
    deal~2_w |  -.0021387   .0011207    -1.89   0.058   1.58287  -.004335  .000058
      merger*|   .0604825   .0187714     3.37   0.001   .381326   .023691  .097274
    acq_as~s*|   .0354187   .0161831     2.16   0.031   .563044     .0037  .067137
       pre07*|  -.0576925   .0069305    -8.29   0.000   .451905  -.071276 -.044109
    t_public*|   -.114349   .0059784    -9.88   0.000   .114692  -.126067 -.102632
    roa_la~w |   .0087199    .005109     1.69   0.090  -.305627  -.001294  .018733
    log_ta_w |  -.0149636   .0016234    -9.17   0.000   12.7287  -.018145 -.011782
    crossb~r*|  -.0328448   .0076056    -4.00   0.000   .209411  -.047752 -.017938
    leve~g_w |   -.065199   .0154945    -4.12   0.000   .272224  -.095568  -.03483
    related2*|   .0072254   .0067743     1.06   0.288    .57493  -.006052  .020503
    mtb_la~w |  -.3315524   .3258843    -1.02   0.310   .002755  -.970274  .307169
    ---------+--------------------------------------------------------------------
      obs. P |   .1292734
     pred. P |   .1060404  (at x-bar)
    ------------------------------------------------------------------------------
    (*) dF/dx is for discrete change of dummy variable from 0 to 1
        z and P>|z| correspond to the test of the underlying coefficient being 0

    Although -pred. P- is .1060404, in the Excel table it appears as 0.0000.

    I thank you in advance for any help you can give.
    Best Regards,

    Pedro
    (StataMP 16 user)

  • #2
    Dear Pedro, did you solve this?

    • #3
      This is an old post that uses an outdated command and gives no reproducible example. outreg2 is from SSC (FAQ Advice #12). Use probit and margins, which supersede dprobit.

      • #4
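        Apparently e(p), the p-value of the chi2 test of overall significance of the regression, was mistaken for the predicted probability. That p-value is 0.0000 in the output in #1 (Prob > chi2), which is the number that shows up in the Excel table.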
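
        One way to check what a stored result holds before passing it to addstat() is to list the e() results after estimation. This is a minimal sketch, assuming the lbw example dataset and the current probit command rather than the data and dprobit call from #1:

        Code:
        webuse lbw, clear
        probit low age i.smoke i.ui i.race
        * list all stored estimation results with their values
        ereturn list
        * e(p) is the p-value of the model chi2 test, not a predicted probability
        display e(p)
        * the same p-value rebuilt from the stored chi2 statistic and its degrees of freedom
        display chi2tail(e(df_m), e(chi2))

        The two display lines should print the same number, namely the Prob > chi2 reported in the estimation header.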

        • #5
          Indeed Øyvind, I see. Thank you.

          And thank you for the suggestion Andrew.

          • #6
            Using dprobit:

            Code:
            webuse lbw, clear
            xi: dprobit low age i.smoke i.ui i.race
            Using probit and margins:

            Code:
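            webuse lbw, clear
            probit low age i.smoke i.ui i.race
            * store the probit results, since margins with the post option overwrites e()
            est sto low
            * observed share of low==1 in the estimation sample
            sum low if e(sample)
            local obsp= r(mean)
            local r2_p= e(r2_p)
            * predicted probability at the means of the covariates
            margins, atmeans post
            local predp= _b[_cons]
            * bring back the probit results, then post the marginal effects at means for outreg2
            est restore low
            margins, dydx(*) atmeans post
            outreg2 using myfile.xls, replace addstat(Pseudo R-squared, `r2_p', Actual Prob., `obsp', Predicted Prob., `predp')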
            Results:

            Code:
            . xi: dprobit low age i.smoke i.ui i.race
            i.smoke           _Ismoke_0-1         (naturally coded; _Ismoke_0 omitted)
            i.ui              _Iui_0-1            (naturally coded; _Iui_0 omitted)
            i.race            _Irace_1-3          (naturally coded; _Irace_1 omitted)
            
            Iteration 0:   log likelihood =   -117.336
            Iteration 1:   log likelihood = -107.42754
            Iteration 2:   log likelihood = -107.29099
            Iteration 3:   log likelihood = -107.29085
            
            Probit regression, reporting marginal effects           Number of obs =    189
                                                                    LR chi2(5)    =  20.09
                                                                    Prob > chi2   = 0.0012
            Log likelihood = -107.29085                             Pseudo R2     = 0.0856
            
            ------------------------------------------------------------------------------
                 low |      dF/dx   Std. Err.      z    P>|z|     x-bar  [    95% C.I.   ]
            ---------+--------------------------------------------------------------------
                 age |  -.0066585   .0069625    -0.95   0.340   23.2381  -.020305  .006988
            _Ismok~1*|   .2310636   .0782544     2.94   0.003   .391534   .077688  .384439
              _Iui_1*|   .1940921   .1032234     1.96   0.050   .148148  -.008222  .396406
            _Irace_2*|   .2383106   .1168187     2.11   0.035   .137566    .00935  .467271
            _Irace_3*|   .2235834   .0868606     2.59   0.010   .354497    .05334  .393827
            ---------+--------------------------------------------------------------------
              obs. P |   .3121693
             pred. P |   .2938575  (at x-bar)
            ------------------------------------------------------------------------------
            (*) dF/dx is for discrete change of dummy variable from 0 to 1
                z and P>|z| correspond to the test of the underlying coefficient being 0
            Code:
                                 (1)
            VARIABLES            y1

            age                  -0.00666
                                 (0.00696)
            1.smoke              0.231***
                                 (0.0783)
            1.ui                 0.194*
                                 (0.103)
            2.race               0.218**
                                 (0.109)
            3.race               0.214***
                                 (0.0825)

            Observations         189
            Pseudo R-squared     0.0856
            Actual Prob.         0.312
            Predicted Prob.      0.294

            Standard errors in parentheses
            *** p<0.01, ** p<0.05, * p<0.1

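            Note that dprobit and margins calculate marginal effects for factor variables differently: in the output above, the estimates for the binary regressors smoke and ui agree, but those for the three-level variable race do not. That is why it is important to use the most up-to-date commands.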

            • #7
              Hi Andrew, how great, thanks a lot for sharing your code.

              I was actually hoping you could explain what the following does

              Code:
              est sto low
              sum low if e(sample)
              and

              Code:
              est restore low
              margins, dydx(*) atmeans post
              And what's the difference here between actual probability and predicted probability?

              • #8
                The actual probability is the average of the outcome which is binary (0/1).

                est sto low
                sum low if e(sample)
                Here I first store the probit estimates, because margins with the post option will overwrite them, and then tell Stata to summarize the outcome over the estimation sample so as to pick out its mean. In binary dependent variable models, \(\hat{y}_i= p_i= Pr(y_i=1|x_i)\) is the probability that subject \(i\) makes a positive response, so the predicted probability is the average of the predicted outcomes. I could also have done the following, but margins is faster.

                Code:
                webuse lbw, clear
                probit low age i.smoke i.ui i.race
                predict lowhat, pr
                sum lowhat

                Code:
                . predict lowhat, pr

                . sum lowhat
                
                    Variable |        Obs        Mean    Std. Dev.       Min        Max
                -------------+---------------------------------------------------------
                      lowhat |        189    .3113034    .1508968   .0539428   .7709825
                Note however that the predicted probability using

                Code:
                margins, atmeans
                is calculated at the mean values of the covariates, which differs from the simple average that I have above. The average is given by

                Code:
                webuse lbw, clear
                probit low age i.smoke i.ui i.race
                margins
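
                The distinction can be written out as a short sketch. The symbols are introduced here for illustration only: \(\Phi\) is the standard normal CDF, \(\hat{\beta}\) the vector of probit coefficient estimates, \(x_i\) the covariates of observation \(i\), \(\bar{x}\) their sample means, and \(N\) the number of observations.

                \[
                \underbrace{\frac{1}{N}\sum_{i=1}^{N}\Phi\!\left(x_i'\hat{\beta}\right)}_{\text{margins}}
                \qquad \text{versus} \qquad
                \underbrace{\Phi\!\left(\bar{x}'\hat{\beta}\right)}_{\text{margins, atmeans}}
                \]

                Because \(\Phi\) is nonlinear, the two generally differ. In the lbw example the first is .3113034 (the mean of lowhat above), while the second is the .294 reported as Predicted Prob. in #6.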
                Last edited by Andrew Musau; 27 Jan 2022, 04:16.

                • #9
                  Dear Andrew, thank you for this! Brilliant, this helps a lot.
