Roodman's cmp command

Carmen Blanco-Arana

Join Date: Aug 2015

Posts: 3
#1

25 Aug 2015, 05:23

Dear colleagues,

I'd like to estimate a model of poverty transitions using the equations in Jenkins' 2011 book, that is:

Y1 = a*X
Y2 = b*W
Y3 = (c*Y1+d*(1-Y1))*X

Y1, Y2 and Y3 are binary variables, X and W are vectors of explanatory variables, and a, b, c and d are vectors of coefficients.
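Written out as latent-variable probits, the system might read as follows. This is a sketch under assumptions the post does not state: the probit link, the error terms u_1, u_2, u_3 and their covariance matrix \Sigma are notation introduced here for illustration only.

```latex
\begin{aligned}
Y_1 &= \mathbf{1}\{\, a'X + u_1 > 0 \,\}, \\
Y_2 &= \mathbf{1}\{\, b'W + u_2 > 0 \,\}, \\
Y_3 &= \mathbf{1}\{\, [\, c\,Y_1 + d\,(1 - Y_1) \,]'X + u_3 > 0 \,\},
\end{aligned}
\qquad
(u_1, u_2, u_3)' \sim N(0, \Sigma), \quad \operatorname{diag}(\Sigma) = (1, 1, 1).
```

In this form the third equation uses coefficients c when Y1 = 1 and coefficients d when Y1 = 0.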

I want to use the cmp command (Roodman, 2009) because I am fitting a multilevel model (individuals within countries).

I have tried the following, but I am not sure how to specify the third equation:

cmp setup
cmp (Y1 = X || country:) (Y2 = W || country:) (Y3 = ?? || country:), indicators($cmp_probit $cmp_probit $cmp_probit)

Please, could someone help me?

Thank you very much in advance

Carmen B.
Stephen Jenkins

Join Date: Apr 2014

Posts: 1439
#2

25 Aug 2015, 10:26

Carmen: welcome to the Forum! Please take a few minutes to read the Forum FAQ -- hit the dark bar at the top of the page just under the Statalist banner heading. It tells you a number of things you should do to maximize the chances of getting helpful responses. Example: use full bibliographic references so that all readers know what you're referring to. Example: explain more precisely what "Y" is. Example: report code fragments using CODE delimiters. Example: cite where user-written commands like cmp come from. Example: (re)register so that you use your full name (firstname lastname) on the Forum -- it's easy to fix this. It is all in the FAQ.

If you are referring to my book (http://ukcatalogue.oup.com/product/9780199226436.do), know also that the material you are referring to was mostly published earlier (Cappellari and Jenkins, Journal of Applied Econometrics 2004) -- a source that may be more accessible to most readers, unless their institution's library has access to OUP's Oxford Scholarship Online series. So cite the article (fully!) -- for the reasons given in the FAQ about maximizing your chances of getting a helpful answer.

After all that, I have to tell you that cmp is probably not going to help you because it doesn't deal with panel data (my research was based on British Household Panel Survey data). Also, in case you were thinking of asking, I don't have the Stata code that was used to derive the original estimates. However, the methods could be implemented using the techniques outlined in and materials accompanying Cappellari and Jenkins, Stata Journal 6(2), 2006 -- a free download.

What Y1, Y2, and Y3 are, and also the specific contents of X and W, is really important. One of the key issues in these sorts of problems is how to deal with Initial Conditions issues (Y1 equation?), and also Sample Drop-out (attrition; Y2 equation). Getting good instruments for sample retention and initial poverty status is hard!

You appear to be adding an additional dimension to our work, namely pooling data on individuals from multiple countries and then using multilevel (random effects) modelling methods. If the number of countries is small, then I would mistrust the statistical reliability of your estimates of 'country effects'. For elaboration, see ‘Regression analysis of cross-national differences using multi-level data: a cautionary tale’, Open Access article at http://esr.oxfordjournals.org/cgi/content/full/jcv059
Carmen Blanco-Arana

Join Date: Aug 2015

Posts: 3
#3

26 Aug 2015, 05:14

Dear Stephen,

Thank you very much for your help. It is very useful for my doctoral thesis.
I am going to follow your recommendations.

Regards,

Carmen B.
David Roodman

Join Date: Jul 2014

Posts: 478
#4

26 Aug 2015, 07:48

Carmen, from a narrow econometric point of view, the challenge with your model for cmp is the interaction term X*Y1. cmp can't really handle that.

I'm not quite sure what Stephen means in saying cmp doesn't deal with panel data. It can fit random effects and differenced and fixed effects models--although the latter requires you to explicitly enter dummies for each group, which can become impossible (if you hit a Stata limit) or slow in practice. Examples are in the help file.

I'm sure Stephen's deeper econometric points stand.
Stephen Jenkins

Join Date: Apr 2014

Posts: 1439
#5

26 Aug 2015, 08:44

I stand corrected for my snap judgement regarding this aspect of cmp's capabilities. (David already knows I think it's a fantastic program.) My remark to Carmen was made on the basis of (a) recollection (perhaps faulty) of email with David long ago about the scope for cmp to fit various longitudinal models, and (b) [related] I don't think the older versions of cmp I was looking at then had the RE modelling feature. (Again, I may remember wrongly.)
In any case, I stand by my remark that Carmen will probably not find cmp useful for her particular 3-equation estimation problem because of its structure (not only the panel aspect), especially the "switching" aspect of equation 3. However, the likelihood expressions for the model are given in the original C-J article. Given those, I think Carmen could fit her model by drawing on the discussion in the C-J 2006 SJ article.
David Roodman

Join Date: Jul 2014

Posts: 478
#6

26 Aug 2015, 11:59

Carmen, I realized I wasn't thinking clearly before. Leaving aside the hierarchical aspect, I think you could do something like:

Code:

cmp (Y1 = X) (Y2 = W) (Y3 = X) (Y3 = X), ind($cmp_probit $cmp_probit "(1-Y1)*$cmp_probit" "Y1*$cmp_probit")

In principle, you can also make this hierarchical:

Code:

cmp (Y1 = X || country:) (Y2 = W || country:) (Y3 = X || country:) (Y3 = X || country:), ind($cmp_probit $cmp_probit "(1-Y1)*$cmp_probit" "Y1*$cmp_probit")

or:

Code:

cmp (Y1 = X || country:) (Y2 = W || country:) (Y3 = X || country:) (Y3 = X || country:), ind($cmp_probit $cmp_probit "(1-Y1)*$cmp_probit" "Y1*$cmp_probit") covariance(independent unstructured)

The latter specifies that the random effects are uncorrelated across equations. Whether these would converge, and whether they'd be reliable, I don't know. It would be tough if the number of countries is small. You could instead try adding country fixed effects (with something like "i.country" in the variable lists), unless Stephen thinks that's a bad idea.
Stephen Jenkins

Join Date: Apr 2014

Posts: 1439
#7

27 Aug 2015, 02:05

David's code fragments at #6 are a neat use of cmp! He also writes:

"The latter specifies that the random effects are uncorrelated across equations."

Allowing for cross-equation correlations is the essence of the C-J model. If all cross-equation correlations are zero, then one can estimate each of the three equations separately. It's one way of identifying the system! But it assumes away the problems of selection on unobservables in initial conditions and sample retention. Economists typically think these are issues to deal with.
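The zero-correlation case can be sketched on toy data. This is only an illustration under assumptions that are not in the thread: the variable names (x, w, y1, y2, y3), the coefficients and the sample size are all made up, and the errors are drawn independently so that equation-by-equation estimation is valid by construction.

```stata
* Toy data only: every variable name and coefficient below is hypothetical.
clear
set seed 12345
set obs 2000
generate x = rnormal()
generate w = rnormal()
generate y1 = (0.5*x + rnormal() > 0)          // initial poverty status
generate y2 = (0.8 + 0.3*w + rnormal() > 0)    // retained in the sample
* Transition outcome with regime-specific slopes; observed only if retained
generate y3 = ((0.9*y1 + 0.2*(1 - y1))*x + rnormal() > 0) if y2 == 1

* With independent errors, each piece can be fitted as a plain probit
probit y1 x
probit y2 w
probit y3 x if y1 == 1 & y2 == 1    // regime Y1 = 1: coefficients "c"
probit y3 x if y1 == 0 & y2 == 1    // regime Y1 = 0: coefficients "d"
```

With real data, where the errors are likely to be correlated, these separate probits would be subject to exactly the selection problems described above.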
Carmen Blanco-Arana

Join Date: Aug 2015

Posts: 3
#8

27 Aug 2015, 04:55

Thank you both very much for your suggestions.

As Stephen said at #2, I am studying poverty dynamics using the equations in Cappellari and Jenkins (Journal of Applied Econometrics, 2004) in a multilevel framework.

The first equation is the initial conditions equation, with Y1 equal to 1 if the individual is poor at time t-1 and 0 otherwise. The second is the attrition equation, where Y2 equals 1 if the individual is in the sample at both t-1 and t, and 0 otherwise. The third is the transition equation, with Y3 equal to 1 if the individual is poor at time t and 0 otherwise. X and W are vectors of explanatory variables.

I think that Cappellari and Jenkins (Stata Journal 6(2), 2006), together with David's code at #6, will help me find the right approach for my research.
Martin Paul

Join Date: Jun 2017

Posts: 36
#9

17 Aug 2017, 17:10

Dear Statalist,

I am using the cmp command to estimate a triple-hurdle model of production and commercialization. The first and second hurdles are probit models, while the third is a truncated normal regression.
However, after following the cmp setup and running the regression, I get error 198 with the message 'unmatched quote, invalid syntax'.
I am very new to Stata. What could be the problem, and how can I fix it?

cmp(pdtndecision = plantimp head_age head_gen chkpexpr head_edu radio wlkdsmnm dstfrcop dstextag cultarea lPrice_improved offfarm_income TLU totallab No_ofcropsgrown ag_machi lrainfall plantimpbar head_edubar radiobar wlkdsmnmbar dstfrcopbar dstextagbar cultareabar Price_improvedbar offfarm_incomebar TLUbar totallabbar No_ofcropsgrownbar ag_machibar rainfall i.year i.district) (MP = plantimp residual2 head_age head_gen chkpexpr head_edu hh_income motor_tr m_phone wlkdsmnm qldmnmkt tscstmmkt dstfrcop dstextag cultarea Price_improved offfarm_income TLU totalprod tvalue_h i.year i.district plantimpbar hhsizebar motor_trbar m_phonebar wlkdsmnmbar qldmnmktbar tscstmmktbar dstfrcopbar dstextagbar cultareabar Price_improvedbar offfarm_incomebar TLUbar totalprodbar tvalue_hbar) (qtysold = plantimp residual head_age head_gen chkpexpr head_edu hh_income motor_tr m_phone wlkdsmnm qldmnmkt tscstmmkt dstfrcop dstextag cultarea Price_improved offfarm_income TLU totalprod tvalue_h i.year i.district plantimpbar hhsizebar motor_trbar m_phonebar wlkdsmnmbar qldmnmktbar tscstmmktbar dstfrcopbar dstextagbar cultareabar Price_improvedbar offfarm_incomebar TLUbar totalprodbar tvalue_hbar), indicators("pdtndecision*$cmp_probit" "MP*$cmp_probit" "qtysold*$cmp_trunc) difficult nonrtolerance qui
Clyde Schechter

Join Date: Apr 2014

Posts: 30194
#10

17 Aug 2017, 17:18

I think the problem is near the end, in the -indicators()- option.

Code:

"qtysold*$cmp_trunc

needs a closing " at the end.

Here's a hint for similar problems in the future. It is pretty easy to find unbalanced quotes in the do-file editor. Quoted strings that properly begin and end with a ", or that begin with `" and end with "' are displayed in brown, whereas normal code is in black. So if you see something that you think is a quoted string and it isn't shown in brown, then you know that something is wrong with it. Then you just scrutinize that little piece of code.
Martin Paul

Join Date: Jun 2017

Posts: 36
#11

18 Aug 2017, 01:22

Thanks, Clyde, for the hint.
However, after correcting that, I still get 'Indicator for qtysold must only evaluate to integers between 0 and 10'. Invalid syntax.
Having gone through David Roodman (2009), "Estimating Fully Observed Recursive Mixed-Process Models with cmp", I still can't find a hint about the problem.
What could be the issue?

cmp(pdtndecision = plantimp head_age head_gen chkpexpr head_edu radio wlkdsmnm dstfrcop dstextag cultarea lPrice_improved offfarm_income TLU totallab No_ofcropsgrown ag_machi lrainfall plantimpbar head_edubar radiobar wlkdsmnmbar dstfrcopbar dstextagbar cultareabar Price_improvedbar offfarm_incomebar TLUbar totallabbar No_ofcropsgrownbar ag_machibar rainfall i.year i.district) (MP = plantimp residual2 head_age head_gen chkpexpr head_edu hh_income motor_tr m_phone wlkdsmnm qldmnmkt tscstmmkt dstfrcop dstextag cultarea Price_improved offfarm_income TLU totalprod tvalue_h i.year i.district plantimpbar hhsizebar motor_trbar m_phonebar wlkdsmnmbar qldmnmktbar tscstmmktbar dstfrcopbar dstextagbar cultareabar Price_improvedbar offfarm_incomebar TLUbar totalprodbar tvalue_hbar) (qtysold = plantimp residual head_age head_gen chkpexpr head_edu hh_income motor_tr m_phone wlkdsmnm qldmnmkt tscstmmkt dstfrcop dstextag cultarea Price_improved offfarm_income TLU totalprod tvalue_h i.year i.district plantimpbar hhsizebar motor_trbar m_phonebar wlkdsmnmbar qldmnmktbar tscstmmktbar dstfrcopbar dstextagbar cultareabar Price_improvedbar offfarm_incomebar TLUbar totalprodbar tvalue_hbar), indicators("pdtndecision*$cmp_probit" "MP*$cmp_probit" "qtysold*$cmp_trunc") difficult nonrtolerance qui
David Roodman

Join Date: Jul 2014

Posts: 478
#12

18 Aug 2017, 06:05

Well, the message seems to be saying that "qtysold*$cmp_trunc" doesn't evaluate to an integer between 0 and 10 for all observations. Have you investigated this?
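One way to investigate is sketched below on toy data. Everything in it is an assumption for illustration: qty is a hypothetical stand-in for qtysold, and the local macro code holds a made-up stand-in value for the model-type code that cmp setup would normally define. The sketch compares multiplying the code by the raw quantity with multiplying it by a 0/1 flag, the pattern used in the indicators at #6.

```stata
* Toy data only; qty is a hypothetical stand-in for qtysold.
clear
set obs 5
generate qty = 3*_n                      // 3, 6, 9, 12, 15

* cmp setup defines the real $cmp_* codes; 8 is only a stand-in value here
local code 8

generate double ind_bad  = qty * `code'         // code scaled by the quantity
generate double ind_good = (qty > 0) * `code'   // 0/1 flag times the code

* Count values that are missing, non-integer, or outside the range 0 to 10
count if missing(ind_bad)  | ind_bad  != floor(ind_bad)  | !inrange(ind_bad, 0, 10)
count if missing(ind_good) | ind_good != floor(ind_good) | !inrange(ind_good, 0, 10)
```

On real data, the same count applied to each expression passed to indicators() would show which observations, if any, fall outside the allowed values, including any with missing values.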
Martin Paul

Join Date: Jun 2017

Posts: 36
#13

18 Aug 2017, 06:21

David, if I understand correctly, a truncated regression is for positive integers, and these need not be limited to the range 0 to 10. In my data set I have positive integers, some of them above 10. I use the truncated specification to handle this, since I dropped all zero observations.
What's the way forward?

The same applies even when I use "qtysold*$cmp_cont".

What cmp command can I use to model this relationship?
David Roodman

Join Date: Jul 2014

Posts: 478
#14

22 Aug 2017, 08:13

The help file contains an example or so of a truncated regression. I would start with that, make sure you understand it, then change it step by step into the model you actually want. Start simple.
Martin Paul

Join Date: Jun 2017

Posts: 36
#15

23 Aug 2017, 05:44

I worked from the help file, which is why I find it difficult to understand where the error comes from. Also, when I change the order of the hurdles, I notice that the error always refers to the last-listed hurdle in the model. For example, when the first hurdle is in the last position, I still get the error "Indicator for 'pdtndecision' must only evaluate to integers between 0 and 10". Invalid syntax.

What could the problem be, given that the 'pdtndecision' hurdle has just two outcomes, 0 and 1?

Any hints?