Report predicted probability from Probit model using outreg2

Pedro Coelho

Join Date: Jan 2017
Posts: 20

27 Aug 2019, 06:53

Hi everyone,

I'm using outreg2 to export my probit results to Excel.

Here's the command and output:

Code:

preserve
keep if period>=0 & period<=3
dprobit d_earnout DA_pa abn_cfo disc_exp abn_prod_ta deal_size_w deal_size2_w merger acq_assets pre07 t_public roa_lag_w log_ta_w crossborder leverage_lag_w related2 mtb_lag_w, robust
predict p1
outreg2 d_earnout period abn_cfo disc_exp abn_prod_ta deal_size_w deal_size2_w merger acq_assets pre07 t_public roa_lag_w log_ta_w crossborder leverage_lag_w related2 mtb_lag_w using probit.xls, drop(_I* d_earnout) margin addstat(Pseudo R-squared, e(r2_p), Actual Prob., e(pbar), Predicted Prob., e(p)) addtext(Year FE, No, Industry FE, No) stats(coef tstat) adec(4) bdec(4) tdec(2) coefastr append

Probit regression, reporting marginal effects           Number of obs =   8161
                                                        Wald chi2(16) = 352.57
                                                        Prob > chi2   = 0.0000
Log pseudolikelihood = -2941.1377                       Pseudo R2     = 0.0639

------------------------------------------------------------------------------
         |               Robust
d_earn~t |      dF/dx   Std. Err.      z    P>|z|     x-bar  [    95% C.I.   ]
---------+--------------------------------------------------------------------
   DA_pa |  -1.17e-07   .0000137    -0.01   0.993  -5.67065  -.000027  .000027
 abn_cfo |  -.0001204   .0000425    -2.82   0.005   1.13782  -.000204 -.000037
disc_exp |  -.0001872   .0001171    -1.60   0.111   2.01864  -.000417  .000042
abn_pr~a |   .0001522   .0000617     2.46   0.014   3.67435   .000031  .000273
deal~e_w |  -.1224253   .2401343    -0.51   0.610   .002953   -.59308  .348229
deal~2_w |  -.0021387   .0011207    -1.89   0.058   1.58287  -.004335  .000058
  merger*|   .0604825   .0187714     3.37   0.001   .381326   .023691  .097274
acq_as~s*|   .0354187   .0161831     2.16   0.031   .563044     .0037  .067137
   pre07*|  -.0576925   .0069305    -8.29   0.000   .451905  -.071276 -.044109
t_public*|   -.114349   .0059784    -9.88   0.000   .114692  -.126067 -.102632
roa_la~w |   .0087199    .005109     1.69   0.090  -.305627  -.001294  .018733
log_ta_w |  -.0149636   .0016234    -9.17   0.000   12.7287  -.018145 -.011782
crossb~r*|  -.0328448   .0076056    -4.00   0.000   .209411  -.047752 -.017938
leve~g_w |   -.065199   .0154945    -4.12   0.000   .272224  -.095568  -.03483
related2*|   .0072254   .0067743     1.06   0.288    .57493  -.006052  .020503
mtb_la~w |  -.3315524   .3258843    -1.02   0.310   .002755  -.970274  .307169
---------+--------------------------------------------------------------------
  obs. P |   .1292734
 pred. P |   .1060404  (at x-bar)
------------------------------------------------------------------------------
(*) dF/dx is for discrete change of dummy variable from 0 to 1
    z and P>|z| correspond to the test of the underlying coefficient being 0

Although -pred. P- is .1060404, in the Excel table it appears as 0.0000.

I thank you in advance for any help you can give.

Best Regards,

Pedro
(StataMP 16 user)

Marinka Willemsen

Join Date: Nov 2019

Posts: 9
#2

25 Jan 2022, 05:44

Dear Pedro, did you solve this?
Andrew Musau

Join Date: Oct 2014

Posts: 10216
#3

25 Jan 2022, 07:06

Old post using an outdated command and no reproducible example. outreg2 is from SSC (FAQ Advice #12). Use probit and margins which supersede dprobit.
Øyvind Snilsberg

Join Date: Oct 2021

Posts: 591
#4

25 Jan 2022, 07:34

Apparently e(p), which is the p-value of the model chi2 test (the Prob > chi2 = 0.0000 line in the output above), was mistaken for the predicted probability.
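
This can be checked directly. The following is a minimal sketch, assuming Stata's lbw example dataset rather than the original poster's data (which was not shared), and using probit in place of dprobit:

```stata
webuse lbw, clear
probit low age i.smoke i.ui i.race

* e(p) is the p-value of the model chi-squared test,
* not a predicted probability
display e(chi2)
display e(p)

* recomputing the p-value from the test statistic reproduces e(p)
display chi2tail(e(df_m), e(chi2))
```

With a p-value as small as the one in the original output, adec(4) in outreg2 rounds it to 0.0000, which is what showed up in the Excel table.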
Marinka Willemsen

Join Date: Nov 2019

Posts: 9
#5

27 Jan 2022, 02:09

Indeed Øyvind, I see. Thank you.

And thank you for the suggestion Andrew.

Andrew Musau

Join Date: Oct 2014
Posts: 10216

27 Jan 2022, 03:10

Code:

webuse lbw, clear
xi: dprobit low age i.smoke i.ui i.race

Using probit and margins:

Code:

webuse lbw, clear
probit low age i.smoke i.ui i.race
est sto low
sum low if e(sample)
local obsp= r(mean)
local r2_p= e(r2_p)
margins, atmeans post
local predp= _b[_cons]
est restore low
margins, dydx(*) atmeans post
outreg2 using myfile.xls, replace addstat(Pseudo R-squared, `r2_p', Actual Prob., `obsp', Predicted Prob., `predp')

Res.:

Code:

. xi: dprobit low age i.smoke i.ui i.race
i.smoke           _Ismoke_0-1         (naturally coded; _Ismoke_0 omitted)
i.ui              _Iui_0-1            (naturally coded; _Iui_0 omitted)
i.race            _Irace_1-3          (naturally coded; _Irace_1 omitted)

Iteration 0:   log likelihood =   -117.336
Iteration 1:   log likelihood = -107.42754
Iteration 2:   log likelihood = -107.29099
Iteration 3:   log likelihood = -107.29085

Probit regression, reporting marginal effects           Number of obs =    189
                                                        LR chi2(5)    =  20.09
                                                        Prob > chi2   = 0.0012
Log likelihood = -107.29085                             Pseudo R2     = 0.0856

------------------------------------------------------------------------------
     low |      dF/dx   Std. Err.      z    P>|z|     x-bar  [    95% C.I.   ]
---------+--------------------------------------------------------------------
     age |  -.0066585   .0069625    -0.95   0.340   23.2381  -.020305  .006988
_Ismok~1*|   .2310636   .0782544     2.94   0.003   .391534   .077688  .384439
  _Iui_1*|   .1940921   .1032234     1.96   0.050   .148148  -.008222  .396406
_Irace_2*|   .2383106   .1168187     2.11   0.035   .137566    .00935  .467271
_Irace_3*|   .2235834   .0868606     2.59   0.010   .354497    .05334  .393827
---------+--------------------------------------------------------------------
  obs. P |   .3121693
 pred. P |   .2938575  (at x-bar)
------------------------------------------------------------------------------
(*) dF/dx is for discrete change of dummy variable from 0 to 1
    z and P>|z| correspond to the test of the underlying coefficient being 0

Code:

 
                      (1)
VARIABLES             y1

age                   -0.00666
                      (0.00696)
1.smoke               0.231***
                      (0.0783)
1.ui                  0.194*
                      (0.103)
2.race                0.218**
                      (0.109)
3.race                0.214***
                      (0.0825)

Observations          189
Pseudo R-squared      0.0856
Actual Prob.          0.312
Predicted Prob.       0.294

Standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1

Note that the way dprobit used to calculate marginal effects for factor variables differs from margins. That is why it is important to use the most up to date commands.
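
A hedged sketch of where the gap in the race effects comes from, in notation that is not part of the outputs above: let \(\Phi\) be the standard normal CDF, \(\hat\beta_2\) and \(\hat\beta_3\) the coefficients on race levels 2 and 3, \(\bar{d}_3\) the sample mean of the level-3 dummy (.354497), and \(\bar{w}'\hat\gamma\) the constant plus the remaining covariates at their means. As far as I can tell, dprobit treats _Irace_2 and _Irace_3 as unrelated dummies and holds _Irace_3 at its mean while switching _Irace_2 from 0 to 1, whereas margins knows that the levels of i.race are mutually exclusive and compares race = 2 with the base level, setting the 3.race indicator to 0:

\[
\text{dprobit:}\quad \Phi\left(\hat\beta_2 + \hat\beta_3 \bar{d}_3 + \bar{w}'\hat\gamma\right) - \Phi\left(\hat\beta_3 \bar{d}_3 + \bar{w}'\hat\gamma\right)
\]

\[
\text{margins:}\quad \Phi\left(\hat\beta_2 + \bar{w}'\hat\gamma\right) - \Phi\left(\bar{w}'\hat\gamma\right)
\]

For smoke and ui, which have only two levels, the two calculations coincide. That is consistent with the identical 0.231 and 0.194 in both sets of results, while the race effects differ (0.238 and 0.224 from dprobit versus 0.218 and 0.214 from margins).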


Marinka Willemsen

Join Date: Nov 2019

Posts: 9
#7

27 Jan 2022, 03:36

Hi Andrew, how great, thanks a lot for sharing your code.

I was actually hoping you could explain what the following does

Code:

est sto low
sum low if e(sample)

and

Code:

est restore low
margins, dydx(*) atmeans post

And what's the difference here between actual probability and predicted probability?
Andrew Musau

Join Date: Oct 2014

Posts: 10216
#8

27 Jan 2022, 03:52

The actual probability is the average of the outcome which is binary (0/1).

est sto low
sum low if e(sample)

So I first store the probit estimates, because margins with the post option will overwrite them, and then summarize the outcome over the estimation sample to pick out its mean. In binary dependent variable models, \(\hat{y}_i= p_i= Pr(y_i=1|x_i)\) is the probability that subject \(i\) gives a positive response, so the predicted probability is the average of the predicted outcomes. I could also have done the following, but margins is faster.

Code:

webuse lbw, clear
probit low age i.smoke i.ui i.race
predict lowhat, pr
sum lowhat

Code:

. predict lowhat, pr

. sum lowhat

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
      lowhat |        189    .3113034    .1508968   .0539428   .7709825

Note however that the predicted probability using

Code:

margins, atmeans

is calculated at the mean values of the covariates, which differs from the simple average that I have above. The average is given by

Code:

webuse lbw, clear
probit low age i.smoke i.ui i.race
margins
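
To put the two quantities side by side, here is a sketch in notation that does not appear in the posts above, so treat the symbols as assumptions: \(\Phi\) is the standard normal CDF, \(\hat\beta\) the probit coefficients, \(\bar{x}\) the vector of covariate means, and \(N\) the number of observations in the estimation sample.

\[
\underbrace{\Phi\left(\bar{x}'\hat{\beta}\right)}_{\text{margins, atmeans}}
\qquad\text{versus}\qquad
\underbrace{\frac{1}{N}\sum_{i=1}^{N}\Phi\left(x_i'\hat{\beta}\right)}_{\text{margins}}
\]

Because \(\Phi\) is nonlinear, the two generally differ. In the lbw example the first is .2938575 (the pred. P that dprobit reports at x-bar) and the second is .3113034 (the mean of lowhat).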

Last edited by Andrew Musau; 27 Jan 2022, 04:16.
Marinka Willemsen

Join Date: Nov 2019

Posts: 9
#9

01 Feb 2022, 03:39

Dear Andrew, thank you for this! Brilliant, this helps a lot.