New on SSC: - prodest - module for production function estimation

Gabriele Rovigatti

Join Date: Sep 2016

Posts: 73
#91

16 Oct 2019, 08:31

Dear Leon,

What you mention is in fact the main issue practitioners face when using the Olley-Pakes methodology. To overcome it, Levinsohn and Petrin proposed using intermediate inputs as the proxy, since these should not, in principle, contain zeros or missing values for active firms. Not surprisingly, their method quickly became the benchmark model.

All this boring intro just to say that, unfortunately, with the available data you either impute the missing values in a reliable way or have to find a way to collect more data on inputs.

Good luck!

Gabriele
Comment
Leon Schmidt

Join Date: Apr 2018

Posts: 98
#92

16 Oct 2019, 09:00

Thanks a lot, Gabriele! Then I'll consider my options. I guess imputing data on inputs or investment is quite difficult; at least I haven't seen papers on it (if you have a reference, please let me know). But I'll see what I can do.
Comment
Guanyu Zheng

Join Date: Aug 2014

Posts: 2
#93

19 Nov 2019, 17:51

Dear Gabriele,

It is very nice to have your Stata and R programmes for estimating different production functions. Nice work!

I have been running some production functions with location dummies to get a sense of the productivity premiums across locations. The code is as follows:
prodest rlngo , free(lnl) state(rlnk) proxy(rlnm) control(loc_2-loc_24) poly(3) id(pent) t(dim_year_key) met(wrdg)

The location dummies are loc_2-loc_24: there are 24 locations, and the first one is the reference category.
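
For readers who want to reproduce this kind of setup, here is a minimal sketch of how such dummies could be built. It assumes a categorical variable, here hypothetically called location, and uses a toy dataset with three locations instead of 24. Only the loc_ naming and the pent identifier come from the post.

```stata
clear
* toy data: pent is the firm id, location is a hypothetical categorical variable
input pent location
1 1
2 2
3 3
4 2
5 1
end

* one indicator per category: loc_1, loc_2, loc_3
tabulate location, generate(loc_)

* drop the first indicator so that location 1 becomes the reference category
drop loc_1

describe loc_*
```

With 24 locations the same two commands would leave loc_2-loc_24, which is the variable range passed to control() above.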

However, Stata returns a "conformability error". I tried moving the location dummies to state( ) (personally I don't think they belong there, but I wanted to see whether it helped) and got another message: "no room to add more variables up to 5000 variables are currently allowed, although you could reset the maximum using set maxvar".

When I used met(lp), i.e., prodest rlngo , free(lnl) state(rlnk) proxy(rlnm) control(loc_2-loc_24) poly(3) id(pent) t(dim_year_key) met(lp), there were no more errors, but the running time was much longer.

Can you give me some hints on how to include these dummy variables in your programme? It is important for me to know how to run the programme properly, as I will include more variables in the production function (for example, competition and innovation).

Cheers,
Guanyu
Comment
Gabriele Rovigatti

Join Date: Sep 2016

Posts: 73
#94

21 Nov 2019, 05:51

Dear Guanyu,

I have replied to your email directly.

Best,

Gabriele
Comment
Hyejin Lee

Join Date: Nov 2018

Posts: 8
#95

02 Mar 2020, 06:23

Dear Gabriele,

I have a specific question about the prodest command. I am estimating markups of each firm.

I am not sure how omega (productivity) or xi (the innovation, or productivity shock) is computed for the first year of my data. My data cover 2008-2014.
Given the embedded algorithm and methodology (see below), I need the lagged value of each observation in the law of motion of productivity, but I have no lagged observations for 2008 from which to calculate omega_lag. Hence, omega for 2008 cannot be calculated, because XI (the innovation term in the law of motion of productivity) has no value.

Thus, I would expect not to have any estimates for the first year of my data (2008), yet my results include values for 2008.
In prodest, how is omega calculated for the first year of the dataset (2008)?
My conjecture is that OMEGA_lag_pol becomes 0 because no data are available for the lag of 2008. Is that right?

Best,

Below is the part of the Mata code that I extracted from De Loecker & Warzynski (2012).
mata:
void GMM_DLW(todo,betas,crit,g,H)
{
// first-stage fitted values, instruments, inputs and lagged inputs
PHI=st_data(.,("phi"))
PHI_LAG=st_data(.,("phi_lag"))
Z=st_data(.,("const","l_lag","k"))
X=st_data(.,("const","l","k"))
X_lag=st_data(.,("const","l_lag","k_lag"))
Y=st_data(.,("y"))
C=st_data(.,("const"))

// productivity and its lag implied by the candidate betas
OMEGA=PHI-X*betas'
OMEGA_lag=PHI_LAG-X_lag*betas'
OMEGA_lag_pol=(C,OMEGA_lag)
// OLS of omega on a constant and lagged omega; XI is the residual
g_b = invsym(OMEGA_lag_pol'OMEGA_lag_pol)*OMEGA_lag_pol'*OMEGA
XI=OMEGA-OMEGA_lag_pol*g_b
// GMM criterion
crit=(Z'XI)'(Z'XI)
}
end
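
Read line by line, the Mata routine above evaluates the following objects for a candidate parameter vector \beta. This is only a restatement of the posted code, writing x_{it} = (1, l_{it}, k_{it}) and z_{it} = (1, l_{i,t-1}, k_{it}), with (\hat g_0, \hat g_1) denoting the OLS coefficients stored in g_b:

```latex
\begin{aligned}
\omega_{it}(\beta)    &= \phi_{it} - x_{it}\beta', \\
\omega_{i,t-1}(\beta) &= \phi_{i,t-1} - x_{i,t-1}\beta', \\
\xi_{it}(\beta)       &= \omega_{it}(\beta) - \hat g_0 - \hat g_1\,\omega_{i,t-1}(\beta), \\
Q(\beta)              &= \Big(\sum_{i,t} z_{it}'\,\xi_{it}(\beta)\Big)'
                         \Big(\sum_{i,t} z_{it}'\,\xi_{it}(\beta)\Big).
\end{aligned}
```

The first line uses only contemporaneous data, whereas the second and third require a previous period for the same firm, so \xi_{it} is only defined for observations that have a lag.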

Last edited by Hyejin Lee; 02 Mar 2020, 06:30.
Comment
Gabriele Rovigatti

Join Date: Sep 2016

Posts: 73
#96

02 Mar 2020, 09:33

Dear Hyejin,

I am glad that you found prodest useful.

Your message leaves out some important details, so in my answer I will have to guess the parts that you skipped: i) you use the predict <newvar>, omega postestimation command in prodest to get the omega you mention, and ii) you use an ACF-corrected method.

According to the documentation (h prodest_p in Stata), predict <newvar>, omega yields TFP as the difference between the first-stage fitted values (\phi_{it}) and the fitted y at the estimated parameter values. In other words, if \beta_{k}^{*} and \beta_{l}^{*} are the final second-stage ACF estimates of the capital and labor parameters, the command returns, for each firm/year,

\omega_{it} = \phi_{it} - \beta_{k}^{*} k_{it} - \beta_{l}^{*} l_{it}

which, as you might guess, does not depend on the years that enter the estimation: in your case, 2008 is not in the estimation sample, but still has a valid \omega_{it} value.
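
As a minimal illustration of this point, the sketch below applies the formula by hand to toy data. The values of phi, k and l, and the scalars b_k and b_l, are made up for illustration and are not prodest output.

```stata
clear
* toy panel: phi stands for the first-stage fitted values
input firm year phi k l
1 2008 2.0 1.0 1.0
1 2009 2.3 1.1 1.2
1 2010 2.5 1.3 1.2
2 2008 1.8 0.9 0.8
2 2009 1.9 1.0 0.9
2 2010 2.2 1.0 1.1
end
xtset firm year

* hypothetical second-stage estimates, for illustration only
scalar b_k = 0.3
scalar b_l = 0.6

* omega uses only contemporaneous values, so it also exists in 2008
gen omega = phi - scalar(b_k)*k - scalar(b_l)*l

* the lag of omega, by contrast, is missing in each firm's first year
gen omega_lag = L.omega

list firm year omega omega_lag, sepby(firm)
```

In this toy panel omega is defined in every year, while omega_lag, and therefore any innovation term built from it, is missing in 2008.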

I hope this clarifies things,

Gabriele
Comment
Hyejin Lee

Join Date: Nov 2018

Posts: 8
#97

03 Mar 2020, 03:24

Dear Gabriele,

Thank you very much for your clear answer! I find prodest very useful.
Indeed, I left important details out of my previous question. I ran the following for each industry:
prodest y, free(l) state(k) proxy(m) control(dy*) met(lp) opt(nm) eval(d0) reps(50) fsres(epsilon_fr)

I used predict <newvar>, omega afterwards, but I did not use the ACF-corrected method because of the Cobb-Douglas (CD) production function.
In addition, I corrected for measurement error using fsres. But I don't think these differences change your answer.

\beta_{k}^{*} and \beta_{l}^{*} are estimated without the first year of data. I therefore think it is more accurate NOT to use or report omega for the first year of my dataset. Moreover, some firms are not observed in all years (the panel is unbalanced), so I guess I should exclude not only 2008 but also each firm's first observed year. Would this be correct?

Best regards,
Hyejin
Comment
Giorgio Presidente

Join Date: Jun 2020

Posts: 5
#98

22 Jun 2020, 07:17

Ciao Gabriele Rovigatti,

Thanks for writing prodest.

What if I want to implement an input price control function B(p), similar to DLGKP2016? For the first stage, I guess we can use the control() option to include linearly p and the appropriate interaction terms between p and the inputs of the production function k, say p*k. Then, prodest will perform a polynomial expansion of p, k, p*k and the proxy variable.

However, I am concerned about the second stage. I am afraid that prodest will form the wrong moment conditions if I follow this strategy. For instance, the moment E(xi*p), with xi being the innovation to unobserved productivity, will not be equal to zero. It should be something like E(xi*p(t-1)).

Assuming that my understanding of the methodology of DLGKP2016 and of prodest is not completely wrong, is there a way to account for this issue? Below is an example of the code I am running:

prodest va, free(l) state(k) proxy(m) valueadded met(lp) poly(2) control(p p2 pl pk pm) endogenous(lag_dexport) acf attrition fsresiduals(fs_acf_op)

Thank you in advance for your help!

Best
Giorgio

Last edited by Giorgio Presidente; 22 Jun 2020, 07:24.
Comment
Gabriele Rovigatti

Join Date: Sep 2016

Posts: 73
#99

22 Jun 2020, 14:24

Ciao Giorgio Presidente ,

thanks for the message, I am glad that you considered prodest for your research.

I am afraid I have to confirm your suspicions: in the case you mention, the second-stage moment conditions computed by prodest would only involve the contemporaneous variables, i.e., E(\xi_{it} p_{it}), and not the correct E(\xi_{it} p_{it-1}). My sense is that it would not be impossible to work out a customized version of the command that implements the correct moments.

I wish I could have been of more help.

Best,

Gabriele

Last edited by Gabriele Rovigatti; 22 Jun 2020, 14:26.
Comment
Giorgio Presidente

Join Date: Jun 2020

Posts: 5
#100

23 Jun 2020, 06:11

Gabriele Rovigatti,

Thanks for your quick response.

OK, I see. To adjust the program, I suspect one could allow the variables in control() to behave as if they were in free(), essentially telling prodest to take lagged values in the second stage. If I find the time to look into it, I will let you know.

Maybe one trick could be to give up contemporaneous values of p in B and just include interactions between p(t-1) with current state variables, and p(t-1) with lagged free variables. In this way the second stage would be fine, I guess.
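
A minimal sketch of how the variables for this workaround could be constructed is shown below. It covers the variable construction only; whether passing them to control() yields valid second-stage moments is the conjecture above, not something this sketch establishes. The data and the names p_lag, p_lag_k and p_lag_l_lag are hypothetical.

```stata
clear
* toy panel: p is the input price, k the state variable, l the free variable
input firm year p k l
1 2000 1.0 2.0 3.0
1 2001 1.1 2.1 3.1
1 2002 1.2 2.3 3.0
2 2000 0.9 1.5 2.0
2 2001 1.0 1.6 2.2
end
xtset firm year

* lagged price, and its interactions with the current state variable
* and with the lagged free variable
gen p_lag       = L.p
gen p_lag_k     = L.p * k
gen p_lag_l_lag = L.p * L.l

list, sepby(firm)
```

All three variables are missing in each firm's first year, so the usable sample shrinks by one period per firm.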

Best
Giorgio
Comment
Xiaotian Hu

Join Date: Mar 2020

Posts: 2
#101

26 Jul 2020, 00:33

Gabriele Rovigatti
Dear Professor Gabriele,

I am currently estimating firm-level productivity using your prodest command. My specification is va=A and my estimation code is:

prodest lnva, free(lnl) state(lnk age) proxy(lnm) control(wto ex) attrition va poly(3) met(lp) reps(50) id(firm) t(year)
gen lntfp_lp=lnva-_b[lnl]*lnl-_b[lnk]*lnk

Here lnl is log labor, lnk is the log of the capital stock, lnm is the log of intermediate inputs, wto is a dummy for the country's entry into the WTO, ex is a dummy for whether a firm exports, and age is the firm's age.

Is my code for obtaining firm productivity correct?

In your program, predict, omega generates predicted values of omega. Is this the same as what I generate with the gen line above? With my data, should the controls also enter the computation of lntfp_lp in that line?

Also, does the option "attrition" refer only to a firm's eventual exit from the sample, or also to exits during the sample period? For example, my data cover 2000 to 2006, and one firm has data for 2000, 2001, 2003 and 2005. Does attrition then mean exit in 2006, the year following 2005, or exit in 2002, 2004 and 2006?

looking forward to your reply

Thanks

Xiaotian
Comment
Gabriele Rovigatti

Join Date: Sep 2016

Posts: 73
#102

29 Jul 2020, 02:27

Dear Xiaotian,

I am glad that you found prodest useful for your research.

1) The predict, omega routine in prodest implements a slightly different model from the one you report. First, you have two state variables (lnk and age), and you should account for both when computing the omega residuals. Second, to obtain omega you should also net out the first-stage residuals, i.e., start from phi_{hat} rather than from lnva. To do that, you might consider running prodest with the fsresiduals(<newvar>) option and then running predict, omega afterward.

2) The attrition option is not super-optimized in prodest, and might not work as expected. When using attrition, the command generates, in the background, an exit dummy variable that takes value 1 at time t for firm i if the firm does not appear in the sample at t+1 and t is not the last available date in the panel (the idea being that firms which stop appearing before the last year have exited the market). I acknowledge, however, that there are several reasons other than exit why a firm might not appear in the sample, and I think that you could easily work out a better solution within the ado-file. Long story short, to answer your question: in your example, the attrition option would currently label exit in all of 2002, 2004 and 2006.
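
The rule described in point 2 can be sketched on a toy panel as follows. This is a reconstruction of the stated logic under the assumption that "last available date" means the last year in the whole panel; it is not the actual prodest source, and the variable name exit is hypothetical.

```stata
clear
* firm 1 matches the example in the question; firm 2 is observed in every year
input firm year
1 2000
1 2001
1 2003
1 2005
2 2000
2 2001
2 2002
2 2003
2 2004
2 2005
2 2006
end

* last date available in the whole panel (2006 here)
summarize year, meanonly
local last = r(max)

* exit = 1 at t if the firm is not observed at t+1 and t is not the last panel date
bysort firm (year): gen byte exit = (year[_n+1] != year + 1) & (year < `last')

list if exit
```

For firm 1 the dummy equals 1 in 2001, 2003 and 2005, which corresponds to exits labelled in 2002, 2004 and 2006, while firm 2 is never flagged.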

I hope this clarifies things,

Gabriele
Comment
Lucia Jiang

Join Date: Aug 2020

Posts: 4
#103

07 Aug 2020, 09:11

Dear Gabriele,

Thank you so much for writing the prodest package, really helpful!

Currently I've been working on a project and trying to estimate firms' productivity and markups using different methods.
When I used LP or OP with ACF correction, I've encountered the issue that the standard errors are missing (output attached).
I noticed that you mentioned in the previous post the issue might come from the starting points, so I also tried to play around with different initial points using "init", but still no standard errors can be reported.

Here is my command:
prodest lva, free(lemp) proxy(linv) state(lcap) method(op) acf fsresiduals(fs_va_oacf) valueadded reps(50)

It would be a pleasure to hear from you.

Thanks,
Lucia

Attached Files
Comment
Gabriele Rovigatti

Join Date: Sep 2016

Posts: 73
#104

12 Aug 2020, 14:47

Dear Lucia,

I am glad that you found prodest useful, and apologies for the late reply.

Unfortunately, without looking at the code and data (or at least at a MWE which reproduces the error) I am not able to provide a definitive answer. Hence, I would recommend that you send them to me directly, if possible, at [email protected]

In fact, the issue that you describe is very unusual and makes me think of memory-related errors rather than starting-point issues. In the latter case, only the state variables' standard errors would be missing, whereas in your case the free variables' SEs are missing too. That cannot be related to second-stage starting points, since the free variables' SEs are taken directly from the first-stage regression.

I hope that it will help,

Gabriele
Comment
Gabriele Rovigatti

Join Date: Sep 2016

Posts: 73
#105

17 Aug 2020, 07:44

** Working Paper Alert **

Shameless self-promotion time: several times I have been asked whether it would be possible to estimate firm markups à la De Loecker and Warzynski (AER, 2012) with prodest. Now it is! I have added an option ("markups") to the predict routine, which allows for markup estimation based on prodest results.

The addition relates to a new command, markupest, which implements DLW and three additional macro models for markup estimation. Interested readers may refer to the companion working paper here, or to the replication + example code here.
Comment
