New on SSC: - prodest - module for production function estimation

Gabriele Rovigatti

Join Date: Sep 2016

Posts: 73
#76

19 Nov 2018, 02:10

Dear Yihao,

I am glad that you found prodest useful.

Unfortunately, I cannot help in these cases without looking at the code and/or the underlying data: it might be due to the seed, the data, or some other issue, although with OP you should NOT encounter such behavior. As specified in the helpfile, please send a reproducible example (i.e., a sample of the data plus a snippet of code that reproduces the error you mention), alongside a brief description of the issue, to [email protected] .

Best,

Gabriele
Comment
Oliver Kendrick

Join Date: Apr 2016

Posts: 12
#77

30 Dec 2018, 07:53

Dear Gabriele Rovigatti

Thank you for a really excellent command - having gone into the ado files I am beyond impressed. Also very grateful that you have engaged so much with users here.

I have a question about an extension of the approach you implement. It seems to me that De Loecker, Eeckhout & Unger 2018 (DLEU) (https://sites.google.com/site/deloec...edirects=0&d=1) - which is an update of the well-known De Loecker & Eeckhout NBER paper of the same name - implement a kind of ACF-type production function estimation but without using a proxy like materials or investment. Their materials and labor inputs are all bundled into COGS (Cost of Goods Sold), which they use as a single variable input. See pages 9-10 of DLEU, including footnote 15, and pages 14-15. There is clearly no identification in the first stage of the control function approach if you do this, but as per ACF they identify in the second-stage GMM anyway.

If I am correct in understanding this, I have two questions:
1) Would there be a way to implement this approach within -prodest-? Currently the free and proxy variables are both required, and they must be different. I was wondering if you know of a way around this. I could write my own code if not, but your procedure is much more sophisticated than anything I would write.
2) What do you think of the DLEU procedure as outlined above? Do you think the labor coefficient is plausibly identified? I know there are issues concerning the elasticity of substitution within the COGS inputs, prices etc. - but I'm wondering in general about production function estimation without using a particular variable as a productivity proxy. Could it be okay in the context of an ACF-type method where you specify an AR(1) process for productivity and identify in the second stage?

Best,

Oliver
Comment
Gabriele Rovigatti

Join Date: Sep 2016

Posts: 73
#78

06 Jan 2019, 04:20

Dear Oliver Kendrick,

Thanks for the message, and sorry for the delayed answer. I will try to answer your questions below:

1) At the moment, prodest does not directly implement the DLEU model - nor DLW (2013), which, if I am not mistaken, is the paper in which the method was originally developed. However - and given that you are able to go through the ado-file - the actual implementation of ACF in prodest is just one step away from it, and you should be able to adapt it easily for your purposes;
2) I know that some researchers - well above my expertise level, actually - cast doubts on the identification of the labor coefficient through the DLEU method. On the other hand, I note that DLEU is R&R at the QJE, and that the last available version (Nov 2018) confirms the empirical strategy. Given all that, I am not able to provide a definitive answer on such a hotly debated topic: I am sure you will find several academic sources to rely on.

I hope to have clarified,

Gabriele
Comment
Oliver Kendrick

Join Date: Apr 2016

Posts: 12
#79

07 Jan 2019, 11:31

Thanks for the reply Gabriele Rovigatti! I appreciate your engagement, and of course completely understand being agnostic on question 2.

If I could ask a follow-up question: when you say that prodest does not implement the DLEU or DLW approach but is close to it, I just wanted to confirm which differences you are referring to.

I see the following, but wanted to check if I had missed anything:
1) DLEU use the same variable for the free input and as the proxy; this is not the case for ACF or prodest.
2) DLEU and DLW suggest a process for productivity which is more general than AR(1) (though in DLW's mock code they do actually use an AR(1) process); ACF suggest an AR(1) process and this is what is used in prodest. [I don't think I would change this in the code?]
3) Is there a slight difference in the second stage's moment conditions? With ACF having a moment condition using phi_{t-1} whereas DLW do not?

Again, thanks so much for your time. I understand that this moves the conversation away from prodest and towards the theoretical literature - I understand if you don't want to spend time discussing this stuff. Just thought I would ask as I am learning this literature for the first time and I clearly do not yet get all of its nuances.

All the best!
Comment
Gabriele Rovigatti

Join Date: Sep 2016

Posts: 73
#80

14 Jan 2019, 03:16

Dear Oliver,

again, sorry for the late reply.

Below I will address the questions you raised:

1) Correct, that is one of the most controversial aspects of DLEU;
2) Again, all correct - but I'd follow DLW in using an AR(1) process (moreover, that is what is actually already implemented in prodest);
3) I am not sure I get the point. My understanding is that both DLW and ACF use similar moment conditions - the \phi_{t-1} in e.g. ACF equation (28) is actually a compact way of writing the AR(1) process.
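
To make point 3 concrete, here is a sketch of how the lagged first-stage fit enters through the AR(1) process, in generic ACF-style notation. The value-added Cobb-Douglas form, the symbols (\rho for persistence, \xi for the productivity innovation) and the instrument set are assumptions chosen for illustration; this is neither a transcription of ACF equation (28) nor of the prodest code.

```latex
\begin{align}
\omega_{it}(\beta) &= \hat{\phi}_{it} - \beta_l l_{it} - \beta_k k_{it} \\
\omega_{it}(\beta) &= \rho\,\omega_{i,t-1}(\beta) + \xi_{it}(\beta) \\
\xi_{it}(\beta) &= \hat{\phi}_{it} - \beta_l l_{it} - \beta_k k_{it}
  - \rho\left(\hat{\phi}_{i,t-1} - \beta_l l_{i,t-1} - \beta_k k_{i,t-1}\right) \\
E\!\left[\xi_{it}(\beta)\begin{pmatrix} k_{it} \\ l_{i,t-1} \end{pmatrix}\right] &= 0
\end{align}
```

Written this way, \hat{\phi}_{i,t-1} appears only because \omega_{i,t-1} is recovered from it, so a moment condition "with \phi_{t-1}" and one stated directly in terms of the AR(1) innovation describe the same restriction.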

I hope to have clarified,

Gabriele
Comment
Oliver Kendrick

Join Date: Apr 2016

Posts: 12
#81

01 Mar 2019, 13:31

Dear Gabriele Rovigatti

Thank you for the reply - and sorry for my own lateness, I didn't see this. You have clarified completely, thank you and I really appreciate your help on this.

Best,

Oliver
Comment
Dave Cox

Join Date: Oct 2017

Posts: 5
#82

04 Apr 2019, 08:04

Dear Gabriele,

First of all, thank you for providing the prodest program. I have three questions that have left me a bit puzzled, and I hope you can find the time to help. I estimated a production function of the form y = b_l*l + b_k*k + b_m*m + omega, where y is revenue, l is labor, k is capital, m is material cost and omega is TFP. All in logs, of course.

1) I used both the acfest command and your prodest command. However, even though I think I specified them to estimate the same equation using the same methods, I get completely different results. I also noticed that your command uses more observations than the acfest command. I found the same issue in your "Theory and Practice of TFP Estimation: The Control Function Approach Using Stata" paper in table 3. How can that be? The code I used is as follows:

Code:

acfest l_y, state(l_k) proxy(l_m) free(l_l) nbs(200)
prodest l_y, state(l_k) proxy(l_m) free(l_l) reps(200) method(lp) acf

2) I am a bit unsure which TFP measure to use. I already found a short discussion in this forum, but it did not become completely clear to me. The acfest and levpet commands calculate TFP as TFP = y - (b_l*l + b_k*k + b_m*m). This does not exclude epsilon, since b_l*l + b_k*k + b_m*m + omega + epsilon - (b_l*l + b_k*k + b_m*m) = omega + epsilon.
Using your command

Code:

predict TFP_2, omega

does exclude epsilon if I see it correctly. This is because psi in the first stage excludes the error term and therefore: psi(m,l,k,omega) - (b_l*l + b_k*k + b_m*m) = omega
My question is therefore: Can one say that your procedure is correct and the other ones are "wrong"? TFP is most of the time defined as the complete residual. Your version however excludes a part of the residual.

3) The prodest command allows for controls to be included in the estimation. I cannot find an explanation, either in the documentation or in your paper mentioned above, of how they enter the estimation. Are they treated like the variables in free()?

Thanks a lot. I am looking forward to your answers.

Best,
Dave
Comment
Gabriele Rovigatti

Join Date: Sep 2016

Posts: 73
#83

04 Apr 2019, 08:35

Dear Dave,

I am glad that you found prodest useful.

Let me try to answer your questions below:

1) In the prodest companion paper - I refer to the latest version, published in the Stata Journal and available here - I devote a couple of sections to the limits of ACF in empirical applications and to possible issues with its implementation on real data. The empirical analysis, broadly confirmed and strengthened by a recent paper in the JAE (here), stresses how different starting points, different optimizers and different solving algorithms may lead to different results (i.e., local optima) in the ACF second-stage estimation. This is exactly the case in several applications using prodest - with all available optimizers - and acfest, and probably in yours, too. My only suggestion would be to try different starting points for the optimization - prodest features an undocumented init() option for this - and different optimizers. Last, acfest reports the number of observations actually used in the estimation, excluding all observations in the estimation sample with no valid lags; prodest, on the other hand, keeps them in the total count, while obviously both estimations are run on the same sample.

2) It is a matter of "taste", to some extent. In prodest, you can either use predict <newvarname>, residuals and generate the TFP as in levpet or acfest, or use predict <newvarname>, omega and exclude epsilon. I do not think that there is any "right" or "wrong", though, as the choice depends on the underlying assumptions of your model.

3) Control variables are used in the second-stage estimation, too.
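
For point 2, the two measures can be produced side by side and compared. This is a minimal sketch that reuses the variable names and options from post #82; tfp_resid and tfp_omega are hypothetical names, and the panel is assumed to be declared already (e.g. with xtset), as in that post.

```stata
* Same specification as in post #82 (panel assumed already declared via xtset)
prodest l_y, state(l_k) proxy(l_m) free(l_l) method(lp) acf reps(200)

* TFP as the full residual, as in levpet/acfest: omega + epsilon
predict tfp_resid, residuals

* TFP net of the first-stage error term: omega only
predict tfp_omega, omega

* Quick look at how much the two measures differ in this sample
summarize tfp_resid tfp_omega
correlate tfp_resid tfp_omega
```

If the two series are very highly correlated, the choice between them is unlikely to drive downstream results; if not, it is worth checking which definition the rest of the analysis assumes.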

I hope to have clarified,

Gabriele
Comment
Dave Cox

Join Date: Oct 2017

Posts: 5
#84

05 Apr 2019, 03:41

Dear Gabriele Rovigatti,

Thanks a lot for clarifying the three issues so quickly. I do not agree 100% with you on point 2. When omega is defined so as not to include epsilon (as in the ACF, LP and OP papers), including epsilon in the calculation of omega does not seem to make sense to me.
However, I do have one more (simple) question about the control variables. If I include a set of dummy variables (e.g. industry classification), do I need to drop one because of perfect multicollinearity?

Thanks a lot

Best,
Dave
Comment
Susie Jia

Join Date: Aug 2019

Posts: 1
#85

03 Aug 2019, 18:53

Dear Gabriele,

Thank you for answering questions in the forum. It is very helpful, in addition to the introduction in Stata. I am trying to run a translog using prodest. I have lnv as the free variable, lnk as the state variable and lni as the proxy variable. I know that we can estimate a translog by adding the option "translog". But I do not want to include the input variables' intersection part, which is lnv*lnk.

In this case, I guess it is better for me to build the translog manually. Should I then put (lnv)^2 and (lnk)^2 in control(), in free(), or in free() and state() respectively? So, basically, I need to choose one of the three lines listed below.

#1. prodest lnq, method(op) free(lnv) proxy(lni) state(lnk) id(id) t(year) control(lnv2 lnk2)
#2. prodest lnq, method(op) free(lnv lnv2 lnk2) proxy(lni) state(lnk) id(id) t(year)
#3. prodest lnq, method(op) free(lnv lnv2) proxy(lni) state(lnk lnk2) id(id) t(year)

I want to calculate the elasticity for input v. Then, I plan to use the function betalnv +2*betalnv2*lnv to calculate the elasticity.

Please let me know how the translog should work if I want to build it manually.

Many thanks.
Susie
Comment
Gabriele Rovigatti

Join Date: Sep 2016

Posts: 73
#86

06 Aug 2019, 07:03

Dear Susie,

Let me parse your question and provide a somewhat "double" answer to it.

1) First, and most important from a methodological perspective, you say that you plan to estimate a translog PF but that you do "not want to include the input variables' intersection part". In a word, you want to estimate a kind of Cobb-Douglas PF with squared inputs. Currently, I am unable to figure out which kind of PF such a model might refer to, nor to work out its properties - that is, I would strongly suggest that you carefully outline its features and test its robustness before proceeding with the estimation.

2) My sense is that #3 is the model you're looking for, even though what you're looking for might not be what you really want to have - see 1).
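
For reference, the elasticity formula proposed in post #85 does follow from the specification in #3. A short sketch, with coefficient names (\beta_v, \beta_{vv}, \beta_k, \beta_{kk}) chosen here purely for illustration, and with q, v and k denoting lnq, lnv and lnk:

```latex
\begin{align}
q_{it} &= \beta_v v_{it} + \beta_{vv} v_{it}^2 + \beta_k k_{it} + \beta_{kk} k_{it}^2
          + \omega_{it} + \varepsilon_{it} \\
\frac{\partial q_{it}}{\partial v_{it}} &= \beta_v + 2\,\beta_{vv}\, v_{it}
\end{align}
```

With the interaction term \beta_{vk} v_{it} k_{it} of a full translog included, the elasticity would gain the additional term \beta_{vk} k_{it}; dropping the interaction therefore forces the elasticity of v to be independent of the capital level.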

I hope to have clarified,

Gabriele
Comment
Leon Schmidt

Join Date: Apr 2018

Posts: 98
#87

16 Oct 2019, 03:08

Dear Gabriele,

Thank you very much for writing this program! I started to use it on the examples in the help-file and it is great!

I am working with historical firm-level data. Of course, we have much less information here. We have gross sales value and the number of workers per firm. From that we could calculate a productivity measure as gross sales value divided by the number of workers, but that would be too crude, I guess. Thus, we want to use a better method and came across your code. However, our data contain neither firm capital nor investment. I was wondering whether you know of any methods to estimate these from some proxies? For example, we know how much horsepower or how many machines a firm had. I guess that these could serve as a proxy for capital. But then again, I am not sure how to calculate investment and consequently use your code.

I know it's a long shot, but maybe you have come across something similar in your previous work and have some advice? Thanks a lot!

All the best
Leon
Comment
Leon Schmidt

Join Date: Apr 2018

Posts: 98
#88

16 Oct 2019, 03:50

And as a follow-up: What does the variable investment measure in the OP-model? Is it how much was added to the capital stock or the change between the capital stocks between two years? Thank you very much again!
Comment
Gabriele Rovigatti

Join Date: Sep 2016

Posts: 73
#89

16 Oct 2019, 07:24

Dear Leon,

I am glad that you found prodest useful.

I will address your question below, but please be aware that the answer goes well beyond the methodological responses (that is, technical issues with the command) I usually provide here, given the nature of your problem.

As you pointed out in the second part of your question, you are forced to use the OP methodology and to resort to computing/defining some form of capital and investment. In particular, the ideal candidate I see for the former is the total horsepower that the plant can produce (which is a rather good proxy for capital, given the nature of the production, especially for manufacturing and historical data!). In turn, the investment would be the discounted difference between two consecutive years - even though I suppose that finding the discount rate in HP terms will not be an easy task! Once you have built capital and investment, you will be able to run prodest exactly as described in the helpfile.
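
The construction above could be sketched along the following lines. One possible reading of the "discounted difference" is the perpetual-inventory identity K[t+1] = (1 - delta)*K[t] + I[t]; that interpretation, the depreciation rate and all variable names (firm_id, year, hp, sales, workers) are assumptions made for illustration only, not a recommendation for any particular dataset.

```stata
* Hypothetical variable names: firm_id, year, hp (total horsepower),
* sales (gross sales value), workers (number of workers)
xtset firm_id year

* Assumed depreciation rate, for illustration only
local delta = 0.05

* Capital proxy: total horsepower, in logs
generate double ln_k = ln(hp)

* Investment implied by K[t+1] = (1 - delta)*K[t] + I[t]
generate double inv = F.hp - (1 - `delta') * hp

* Logs; non-positive or missing investment is left missing here
generate double ln_i = ln(inv) if inv > 0 & !missing(inv)
generate double ln_y = ln(sales)
generate double ln_l = ln(workers)

* OP estimation with investment as the proxy variable
prodest ln_y, free(ln_l) state(ln_k) proxy(ln_i) method(op) id(firm_id) t(year)
```

Observations with zero or negative implied investment drop out of the estimation in this sketch; how to treat them is a separate modelling choice.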

I hope to have clarified,

Gabriele
Comment
Leon Schmidt

Join Date: Apr 2018

Posts: 98
#90

16 Oct 2019, 07:54

Dear Gabriele,

Thank you very much for your prompt answer!

Indeed, it is a much broader question. I just posted it here so that it is open to others that might be interested as well.

Just one more question on the investment variable since I am only getting acquainted with the literature. Assume that I can calculate the difference between two years in the horsepower values and discount it somehow. Then by definition of the model I understand that I have to take the logarithm of this value. However, what do I do about zeros and missing values? I think it is similar for the case when you have true investment data since you could choose between gross or net investments (which can be negative or zero). Is there some sort of standard in the literature on what to do in these cases?

Again, thank you very much!

All the best
Leon
Comment
