Box-Cox double hurdle model

Daryosh Jan

Join Date: Jul 2016

Posts: 22
#1

16 Aug 2016, 16:35

Dear Stata users,

I am running the code below to fit a Box-Cox double hurdle model (the code comes from a published paper). I get the following error message:
invalid evaluator type
r(3498);

Could you please help me solve this problem?

global listy x1 x2 x3 x4 x5
global listd x1 x2 x3
program define d_h
version 14
args Inf thetal theta2 theta3 theta4
tempvar d p z p0 pll yt
quietly gen double `d'= $ML_yl >0
quietly gen double `p'= normprob(`theta3')
quietly gen double `1'= `theta4'
quietly gen double `yt'= ($MLyl^`I'-1)/`l'
quietly gen double `z'= (`yt'-`thetal ')/(`theta2')
quietly gen double `pO'= 1-(`p'*normprob(-`z'))
quietly gen double `pl' = (($ML_yl + (1-`d'))^ (`T'-1))*`p'*normd(`z')/`theta2'
quietly replace `lnf' = ln((1-`d')*`p0' + `d'*`pl')
end
ml model If d_h (yt=`listy') () (yt=`listd') ()
ml init b, copy
ml maximize

Christophe Kolodziejczyk

Join Date: Mar 2014
Posts: 377

17 Aug 2016, 02:15

You should write lf (starting with a lowercase L) instead of If (starting with a capital I). You will be able to see the difference if you use the code style (# button). By the way, drop the ml init line.

Code:

global listy x1 x2 x3 x4 x5
global listd x1 x2 x3 
program define d_h
version 14
args Inf thetal theta2 theta3 theta4
tempvar d p z p0 pll yt 
quietly gen double `d'= $ML_yl >0 
quietly gen double `p'= normprob(`theta3') 
quietly gen double `1'= `theta4' 
quietly gen double `yt'= ($MLyl^`I'-1)/`l' 
quietly gen double `z'= (`yt'-`thetal ')/(`theta2') 
quietly gen double `pO'= 1-(`p'*normprob(-`z')) 
quietly gen double `pl' = (($ML_yl + (1-`d'))^ (`T'-1))*`p'*normd(`z')/`theta2' 
quietly replace `lnf' = ln((1-`d')*`p0' + `d'*`pl') 
end 
ml model If d_h (yt=`listy') () (yt=`listd') () 
ml init b, copy
ml maximize


Christophe Kolodziejczyk

Join Date: Mar 2014

Posts: 377
#3

17 Aug 2016, 02:17

and use ml trace on before ml maximize
Daryosh Jan

Join Date: Jul 2016

Posts: 22
#4

17 Aug 2016, 23:02

Thank you Christophe. Here, I implemented everything as you suggested. I got the following error:

global listy educ cigpric white age agesq income

. global listd educ white age agesq

. program define d_h
1. version 14
2. args Inf thetal theta2 theta3 theta4
3. tempvar d p z p0 pll yt
4. quietly gen double `d'= $ML_yl >0
5. quietly gen double `p'= normprob(`theta3')
6. quietly gen double `1'= `theta4'
7. quietly gen double `yt'= ($MLyl^`I'-1)/`l'
8. quietly gen double `z'= (`yt'-`thetal ')/(`theta2')
9. quietly gen double `pO'= 1-(`p'*normprob(-`z'))
10. quietly gen double `pl' = (($ML_yl + (1-`d'))^ (`T'-1))*`p'*normd(`z')/`theta2'
11. quietly replace `lnf' = ln((1-`d')*`p0' + `d'*`pl')
12. end

. ml model lf d_h (cigs=`listy') () (eversmoked=`listd') ()

. ml trace on

. ml maximize

-> d_h __000008 __000009 __00000A __00000B __00000C
- `begin'
= capture noisily version 14.1: d_h __000008 __000009 __00000A __00000B __00000C
---------------------------------------------------------------------------------------------------------------- begin d_h ---
- version 14
- args Inf thetal theta2 theta3 theta4
- tempvar d p z p0 pll yt
- quietly gen double `d'= $ML_yl >0
= quietly gen double __00000D= >0
>0 invalid name
------------------------------------------------------------------------------------------------------------------ end d_h ---
- `end'
= set trace off
r(198);

end of do-file

r(198);

For your information, I used the Wooldridge data set SMOKE.DTA (http://qcpages.qc.cuny.edu/~rvesselinov/statafiles.html). Could you please let me know why it doesn't work?

Daryosh

Christophe Kolodziejczyk

Join Date: Mar 2014
Posts: 377

18 Aug 2016, 05:45

You coded $ML_yl (ending in a lowercase L) instead of $ML_y1 (ending in the digit one), and $ML_yl does not exist. You can see this when you run the program with trace on: the output shows that $ML_yl resolves to nothing, i.e. it is empty.
It is a Stata convention that in an ml program the names of the dependent variables are stored in the globals $ML_y1, $ML_y2, $ML_y3, etc. It seems that you consistently mistake 1 for l and l for 1 (and 0 for O).
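A minimal sketch of why the typo fails silently until the generate line: Stata expands a global macro that was never defined to an empty string. The global DEMO_y1 is a hypothetical stand-in used only for illustration; in a real run ml itself defines $ML_y1.

```stata
* DEMO_y1 is a hypothetical global used only for this illustration;
* in an actual likelihood evaluator, -ml- sets $ML_y1 for you.
global DEMO_y1 cigs

* Name ending in the digit one: defined, so it expands to "cigs"
display "DEMO_y1 (digit one) expands to: [$DEMO_y1]"

* Name ending in a lowercase L: never defined, so it expands to nothing
display "DEMO_yl (letter l)  expands to: [$DEMO_yl]"
```

The second display shows empty brackets, which is the same situation as the trace line that reads gen double __00000D= >0.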

I have some suggestions for your code.

Code:

program define d_h
version 14
args Inf theta1 theta2 theta3 theta4 // <- changed l to 1
tempvar d p z p0 pll yt 
quietly gen double `d'= $ML_y1 >0  // <- changed 
quietly gen double `p'= normprob(`theta3') 
** quietly gen double `l'= `theta4'  // is it not l instead of 1? This line is not necessary anyway
quietly gen double `yt'= ($MLy1^`theta4'-1)/`theta4'  // <- changed l to theta4
quietly gen double `z'= (`yt'-`theta1 ')/(`theta2') 
quietly gen double `p0'= 1-(`p'*normprob(-`z')) 
quietly gen double `pl' = (($ML_y1 + (1-`d'))^ (`T'-1))*`p'*normd(`z')/`theta2' // <- changed. Btw what is `T', it is not defined elsewhere?
quietly replace `lnf' = ln((1-`d')*`p0' + `d'*`pl') 
end

You will encounter another problem, though, since `T' is not defined.

ps. try to use the Stata code style when writing code. You do that with the # button in the editor


Daryosh Jan

Join Date: Jul 2016

Posts: 22
#6

18 Aug 2016, 08:14

Actually, I didn't write this code. I got it from the paper titled "Hurdle Models of Loan Default" http://link.springer.com/article/10....e.jors.2601922

When I copied the code, for some reason it changed to `T', although there is no `T' in the original. Here is the correct code:

[CODE]
global listy educ cigpric white age agesq income
global listd educ white age agesq
program define d_h
version 14
args Inf theta1 theta2 theta3 theta4
tempvar d p z p0 pll yt
quietly gen double `d'= $ML_y1 >0
quietly gen double `p'= normprob(`theta3')
quietly gen double `l'= `theta4'
quietly gen double `yt'= ($MLy1^`theta4'-1)/`theta4'
quietly gen double `z'= (`yt'-`theta1 ')/(`theta2')
quietly gen double `p0'= 1-(`p'*normprob(-`z'))
quietly gen double `pl' = (($ML_y1 + (1-`d'))^ (`l'-1))*`p'*normd(`z')/`theta2'
quietly replace `lnf' = ln((1-`d')*`p0' + `d'*`pl')
end
ml model lf d_h (cigs=`listy') () (eversmoked=`listd') ()
ml trace on
ml maximize
/CODE]

When I execute the above code I get the following error:

-> d_h __000008 __000009 __00000A __00000B __00000C
- `begin'
= capture noisily version 14.1: d_h __000008 __000009 __00000A __00000B __00000C
------------------------------------------------------------------------------------------------------------------ begin d_h ---
- version 14
- args Inf theta1 theta2 theta3 theta4
- tempvar d p z p0 pll yt
- quietly gen double `d'= $ML_y1 >0
= quietly gen double __00000D= cigs >0
- quietly gen double `p'= normprob(`theta3')
= quietly gen double __00000E= normprob(__00000B)
- quietly gen double `l'= `theta4'
= quietly gen double = __00000C
too few variables specified
-------------------------------------------------------------------------------------------------------------------- end d_h ---
- `end'
= set trace off
r(102);

end of do-file

r(102);

Could you please let me know why that is? I do have enough variables.
Thank you,

Daryosh

PS: I used the # button to write the code. As this is my first time using it, I may have made a mistake; please forgive me if the code does not appear as it should.
Christophe Kolodziejczyk

Join Date: Mar 2014

Posts: 377
#7

18 Aug 2016, 14:20

Here is code which seems to work (it gives some output at least). If you copied and pasted the code directly from the PDF, some of the characters might have been mixed up: the 1's became l's, p1 became pll, and so on. Some characters also disappeared in places, underscores for example, so $ML_y1 became $MLy1. I guess that is why you get errors when you run the code, which in theory should work.

To answer your question more specifically: the local l is not defined in your tempvar statement, so when you try to generate `l', Stata has no knowledge that the local exists and it expands to nothing, hence the error. That is what the trace of the program tells you.
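A small sketch of the same failure outside the likelihood program. The names a and b and the five-observation toy dataset are assumptions made for illustration: a is declared with tempvar, b is not.

```stata
clear
set obs 5

* `a' is declared, so it expands to a temporary variable name
tempvar a
generate double `a' = 1

* `b' was never declared, so it expands to nothing and Stata is
* asked to run "generate double  = 1", which is a syntax error
capture noisily generate double `b' = 1
display "return code: " _rc
```

The second generate should fail with the same kind of message as in the trace above, because Stata sees no variable name before the equals sign.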

Another remark concerns the function normd(). It has not been part of Stata for a while and has been replaced by normalden(), which I have used in the code below. The original code is quite old (it dates back to 2005 at least), and you will get into trouble if you use it with newer versions of Stata. So either use the same version of Stata as the authors or use the newer functions.
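As a quick check of the current function names (a sketch, not part of the paper's code): normalden() is the standard normal density and normal() is the documented standard normal distribution function in current Stata.

```stata
* Standard normal density at 0, which equals 1/sqrt(2*pi)
display normalden(0)
display 1/sqrt(2*_pi)

* Standard normal cumulative distribution function at 0
display normal(0)
```

The first two lines should print the same number, which confirms that normalden() is a drop-in replacement for the density the old code asked normd() to compute.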

I assume that eversmoked is just a dummy argument which is not used during estimation but necessary to get the syntax to work properly.

Finally, I recommend getting a copy of Maximum Likelihood Estimation with Stata. It is a very useful book if you want to program maximum likelihood estimators.

Code:

use smoke.dta, clear
gen eversmoked = cigs>0
global listy educ cigpric white age agesq income
global listd educ white age agesq
cap prog drop d_h
program define d_h
version 14
args lnf theta1 theta2 theta3 theta4
tempvar d p z p0 p1 yt
quietly gen double `d'= $ML_y1 >0
quietly gen double `p'= normprob(`theta3')
// quietly gen double `l'= `theta4'
quietly gen double `yt'= ($ML_y1^`theta4'-1)/`theta4'
quietly gen double `z'= (`yt'-`theta1 ')/`theta2'
quietly gen double `p0'= 1-(`p'*normprob(-`z'))
quietly gen double `p1' = (($ML_y1 + (1-`d'))^ (`theta4'-1))*`p'*normalden(`z')/`theta2'
quietly replace `lnf' = ln(cond(`d',`p1',`p0')) //ln((1-`d')*`p0' + `d'*`p1')
end
ml model lf d_h (cigs=`listy') () (eversmoked=`listd') ()
ml trace off
ml maximize
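For reference, the per-observation log likelihood that the program d_h evaluates can be written out as follows. This is a transcription of the code above, not a formula taken from the paper; the notation (λ for theta4, d_i for the indicator `d', the index i on the two linear predictors) is my own assumption.

```latex
\begin{aligned}
y_i^{(\lambda)} &= \frac{y_i^{\lambda} - 1}{\lambda}, \qquad
z_i = \frac{y_i^{(\lambda)} - \theta_{1i}}{\theta_2}, \qquad \lambda = \theta_4, \\
p_{0i} &= 1 - \Phi(\theta_{3i})\,\Phi(-z_i), \\
p_{1i} &= y_i^{\lambda - 1}\,\Phi(\theta_{3i})\,\frac{\phi(z_i)}{\theta_2}, \\
\ln L_i &= (1 - d_i)\,\ln p_{0i} + d_i\,\ln p_{1i}, \qquad d_i = \mathbf{1}[y_i > 0].
\end{aligned}
```

Here Φ and φ are the standard normal distribution function and density. In the code, the term (y + (1 - d)) in p1 only keeps the expression finite when y = 0; that value is never used, because cond() selects p0 for those observations.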

Concerning the use of the # button, you were almost there. For some reason a bracket disappeared, which gave an unintended result. Try practicing in the "sandbox" forum (I didn't come up with that name) to feel more comfortable with the editor.
Daryosh Jan

Join Date: Jul 2016

Posts: 22
#8

18 Aug 2016, 18:23

You did a great job and solved my problem. You are right that copying the code from the PDF, together with the difference in Stata versions, may have caused the problems. Anyhow, I got results for equations 1 to 4, but I cannot see the coefficients of my independent variables as reported in the paper. Could you please let me know how to get the individual coefficient for every variable listed, such as white, age, etc.?

FYI, I just ordered the book and that may be helpful to me.

Thanks a lot for your timely help and support.
Daryosh Jan
Christophe Kolodziejczyk

Join Date: Mar 2014

Posts: 377
#9

19 Aug 2016, 01:10

listy and listd are defined as globals but called as locals. Since the locals of those names are empty, ml estimates the model with constants only. I should have noticed earlier...
But it's easy to fix

Code:

ml model lf d_h (cigs=$listy) () (eversmoked=$listd) ()
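A short sketch of the difference, using the same variable list as above: a name defined with global is only visible through the $ prefix, while the local macro of the same name is a separate, undefined object.

```stata
global listy educ cigpric white age agesq income

* Referenced as a local: no local named listy exists, so this is empty
display "as a local:  [`listy']"

* Referenced as a global: expands to the variable list
display "as a global: [$listy]"
```

This is why the earlier ml model call ran without error but reported only constants: each equation received an empty list of regressors.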
Daryosh Jan

Join Date: Jul 2016

Posts: 22
#10

21 Aug 2016, 09:52

Thanks a lot. It works perfectly fine now.