Hi Statalisters,
I am a novice Stata user, working with Stata 14 on Windows 7.
I am working on a panel data set of all commercial banks in the U.S. for the period 1995-2018 (the time variable is year), so the data are at the bank-year level. I created the ID variable from the variables bank name and cert. I have already calculated four bank risk proxies at the bank-year level: Z-Score, NPA (non-performing assets), LLP (loan loss provisions) and LLR (loan loss reserves).
Using the risk proxy Z_score, lagged by one year, I would like to run a binary probability model explaining the occurrence of a bank failure (Failure = 1, Active = 0).
I used these commands to code my binary outcome variable as 1 for "Failure" and 0 for "Active".
This is the dataset, with 172,431 observations:
Because I have panel data, I started with these commands to run the probit regression. [I forgot to add , vce(cluster id), and I think cformat(%09.0g) pformat(%05.0g) sformat(%08.0g) is irrelevant.]
This is the binary probability model explaining the occurrence of a bank failure (Failure = 1, Active = 0) with the Z_score (lagged by one year).
This is the regression result:
Question 1: It took a long time to get the estimation results. I know I am working with Stata 14 on Windows 7 and have 172,431 observations, but is there a way to code this so that it runs faster?
Question 2: My "guiding paper" assesses the Z-Score model on its pseudo R2. I know that the output of the "normal" probit regression shows the pseudo R2, stored in e(r2_p), and that there is a way to calculate a pseudo R2 for the xtprobit panel data probit regression. I found this calculation: https://www.stata.com/support/faqs/s...ics/r-squared/
Unfortunately, I cannot work out how to calculate the pseudo R2 for my xtprobit case.
Concern: The regression results are far from those in my guiding paper, and I think I did something wrong. Maybe with the binary dependent variable status?
Thank you very much for your support!
These are the commands I used to create the binary outcome variable status (1 = Failed, 0 = Active):
Code:
. merge m:1 cert using `dataset1', assert(match master)
    Result                           # of obs.
    -----------------------------------------
    not matched                       674,977
        from master                   674,977  (_merge==1)
        from using                          0  (_merge==2)
    matched                            23,238  (_merge==3)
    -----------------------------------------
. gen byte status = (_merge == 3)
. label define status 0 "Active" 1 "Failed"
. label values status status
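One thing that may matter for the concern raised in this post: because the merge is m:1 on cert alone, every bank-year of a matched bank receives _merge == 3, so status as coded above is 1 in all years of a bank that ever failed, not only in the year of failure. A minimal sketch of how one might check this, and of a year-specific alternative, follows. It assumes the variables id, year and status from this post. The names status_varies, one_per_bank, fail_year and failed_now are hypothetical, and fail_year in particular stands for a failure-year variable that would have to come from the failed-bank list.

```stata
* Sketch only: assumes id, year and status exist as described in the post.
* Flag banks whose status changes over time (0 = constant within bank).
bysort id (status): generate byte status_varies = status[1] != status[_N]
egen byte one_per_bank = tag(id)
tabulate status_varies if one_per_bank

* Hypothetical alternative: fail_year is NOT in the posted data. If it held
* the calendar year of failure (missing for active banks), the indicator
* below would be 1 only in the bank-year of failure.
capture confirm variable fail_year
if _rc == 0 {
    generate byte failed_now = (year == fail_year)
}
```

If the tabulation shows status_varies equal to 0 for every bank, the outcome is time-invariant within bank, which would be consistent with the very high rho in the output further down.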
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input float(id year status Z_score)
  7 1995 1 -1.4038005
 10 1995 0  -1.434213
 11 1995 0 -1.5771302
 14 1995 0  -1.758422
 16 1995 0 -1.4295077
 21 1995 0  -1.329172
 27 1995 0 -1.3730284
 32 1995 0 -1.3627455
 34 1995 0  -1.908463
 38 1995 0 -1.8048723
 41 1995 0 -1.5905398
 46 1995 0  -1.533159
 47 1995 0  -1.663955
 48 1995 0  -1.518417
 52 1995 0 -1.3701818
 53 1995 0  -1.485621
 56 1995 0   -1.63249
 59 1995 0  -1.476241
 76 1995 0 -1.3577138
 82 1995 0 -1.3661845
 84 1995 0 -1.3885205
 85 1995 0 -1.5949416
 87 1995 0 -2.0597448
 99 1995 0 -2.2821965
101 1995 0 -1.5937258
104 1995 0 -1.6237373
end
format %ty year
Code:
. xtset id year, yearly
       panel variable:  id (unbalanced)
        time variable:  year, 1995 to 2018, but with gaps
                delta:  1 year
Code:
xtprobit status Z_score L.year, re
Code:
Fitting comparison model:
Iteration 0: log likelihood = -22279.067
Iteration 1: log likelihood = -20346.952
Iteration 2: log likelihood = -20275.616
Iteration 3: log likelihood = -20275.347
Iteration 4: log likelihood = -20275.347
Fitting full model:
rho = 0.0 log likelihood = -20275.347
rho = 0.1 log likelihood = -14462.788
rho = 0.2 log likelihood = -12201.376
rho = 0.3 log likelihood = -10943.278
rho = 0.4 log likelihood = -10144.299
rho = 0.5 log likelihood = -9623.8447
rho = 0.6 log likelihood = -9287.5278
rho = 0.7 log likelihood = -9127.5396
rho = 0.8 log likelihood = -9205.7192
Iteration 0: log likelihood = -9047.5173
Iteration 1: log likelihood = -7511.1642
Iteration 2: log likelihood = -4988.8547
Iteration 3: log likelihood = -4534.5219
Iteration 4: log likelihood = -3701.1825
Iteration 5: log likelihood = -3659.677 (not concave)
Iteration 6: log likelihood = -3595.4789
Iteration 7: log likelihood = -3595.4789 (backed up)
Iteration 8: log likelihood = -3564.2206
Iteration 9: log likelihood = -3557.6508
Iteration 10: log likelihood = -3557.6302
Iteration 11: log likelihood = -3557.6302
Random-effects probit regression                Number of obs     =    156,147
Group variable: id                              Number of groups  =     14,692
Random effects u_i ~ Gaussian                   Obs per group:
                                                              min =          1
                                                              avg =       10.6
                                                              max =         23
Integration method: mvaghermite                 Integration pts.  =         12
                                                Wald chi2(2)      =     111.93
Log likelihood  = -3557.6302                    Prob > chi2       =     0.0000
------------------------------------------------------------------------------
      status |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     Z_score |   .8909937   .0877437    10.15   0.000     .7190193    1.062968
             |
        year |
         L1. |  -.0197096   .0049606    -3.97   0.000    -.0294322   -.0099869
             |
       _cons |   34.67511    9.94973     3.49   0.000     15.17399    54.17622
-------------+----------------------------------------------------------------
    /lnsig2u |   2.519077   .0220374                      2.475884    2.562269
-------------+----------------------------------------------------------------
     sigma_u |   3.523794   .0388276                      3.448509    3.600723
         rho |   .9254684   .0015201                      .9224338    .9283934
------------------------------------------------------------------------------
LR test of rho=0: chibar2(01) = 3.3e+04                Prob >= chibar2 = 0.000
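The command shown above includes L.year, the lag of the time variable, and Z_score unlagged, whereas the stated aim is to use the risk proxy lagged by one year. A minimal sketch of that specification follows, assuming the data are xtset on id and year as above. The second command is a pooled probit with bank-clustered standard errors. That is a different model (it has no random effect), but it usually estimates much faster and reports a pseudo R2. Whether it matches the setup of the guiding paper is an assumption to check.

```stata
* Sketch only: assumes -xtset id year- has been run and the variables
* status and Z_score are named as in the post.

* Random-effects probit with the risk proxy lagged by one year.
xtprobit status L.Z_score, re vce(cluster id)

* Pooled probit, standard errors clustered by bank (no random effect).
probit status L.Z_score, vce(cluster id)

* McFadden's pseudo R2 as stored by -probit-.
display e(r2_p)
```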
For Question 2, this is the calculation from the Stata FAQ (https://www.stata.com/support/faqs/s...ics/r-squared/):
Code:
regress weight length
predict weightp if e(sample)
corr weight weightp if e(sample)
di r(rho)^2
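The FAQ calculation above uses regress. For xtprobit, one possibility is a McFadden-type pseudo R2 computed by hand from the log likelihoods of the fitted model and of a constant-only model estimated on the same sample. This is a sketch under assumptions: it uses L.Z_score as the only regressor, it takes the constant-only random-effects probit as the null model (other choices of null model give different values), and the scalar names ll_full, ll_null and r2_mcf are illustrative.

```stata
* Sketch only: assumes -xtset id year- has been run and the variables
* status and Z_score are named as in the post.

* Full model; keep its log likelihood.
xtprobit status L.Z_score, re
scalar ll_full = e(ll)

* Constant-only random-effects probit on the same estimation sample.
xtprobit status if e(sample), re
scalar ll_null = e(ll)

* McFadden-type pseudo R2: 1 - lnL(full) / lnL(constant only).
scalar r2_mcf = 1 - ll_full/ll_null
display "McFadden-type pseudo R2 = " r2_mcf
```

The value is not directly comparable with the pseudo R2 from a pooled probit, because the two null models differ.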
