New on SSC: -sfkk- module to estimate endogenous stochastic frontier models in the style of Karakaplan and Kutlu (2015)

Mustafa Ugur Karakaplan

Join Date: Jun 2015
Posts: 26


18 Jun 2015, 12:31

Thanks to the amazing Kit Baum, -sfkk- is now available on SSC. You can install it by entering the following command in Stata:

Code:

ssc install sfkk

sfkk fits endogenous stochastic production or cost frontier models following the methodology provided by Karakaplan and Kutlu (2015). sfkk provides estimators for the parameters of a linear model with a disturbance that is assumed to be a mixture of two components: a measure of inefficiency which is strictly nonnegative and a two-sided error term from a symmetric distribution. sfkk can handle endogenous variables in the frontier and/or the inefficiency, and the sfkk estimates outperform the standard frontier estimates that ignore endogeneity. See Karakaplan and Kutlu (2015) for a detailed explanation of their methodology and empirical analyses.

Karakaplan and Kutlu (2017) provides the econometric methodology that -sfkk- is based on. The paper is published in the Economics Bulletin: http://www.accessecon.com/Pubs/EB/20...V37-I2-P79.pdf

Karakaplan (2017) provides the -sfkk- program itself, with some examples. The paper is published in the Stata Journal: http://www.stata-journal.com/article...article=st0466

The -sfkk- help file provides several examples, which you can view by typing the following command in Stata after installing the -sfkk- package:

Code:

help sfkk

Below is an example of -sfkk- output:

Code:

. use http://www.mukarakaplan.com/files/sfkkprod.dta, clear

. sfkk y x1 x2 z1, prod u(z2) en(z1 z2) i(iv1 iv2) delve nicely header beep compare


18 Jun 2015 12:09:18


ENDOGENOUS STOCHASTIC PRODUCTION FRONTIER MODEL (Model EN)

Dependent Variable: y

Frontier Variable(s): x1 x2 z1

U Variable(s): z2

W Variable(s):

Endogenous Variable(s): z1 z2

Excluded Instrument(s): iv1 iv2

Exogenous Variable(s): iv1 iv2 x1 x2


Delving into the problem...

initial:       log likelihood =  709.21899
rescale:       log likelihood =  709.21899
rescale eq:    log likelihood =  709.21899
Iteration 0:   log likelihood =  709.21899  
Iteration 1:   log likelihood =  713.90317  
Iteration 2:   log likelihood =  713.98024  
Iteration 3:   log likelihood =  713.98037  
Iteration 4:   log likelihood =  713.98037  


Analyzing the exogenous comparison model...

Table: Estimation Results
----------------------------------------------------------------
                              Model EX             Model EN    
----------------------------------------------------------------
Dep.var: y                                                      
Constant                  0.475***  (0.017)    0.631***  (0.032)
x1                        0.215***  (0.019)    0.186***  (0.031)
x2                        0.089***  (0.021)    0.132***  (0.033)
z1                       -0.355***  (0.022)   -0.747***  (0.111)
----------------------------------------------------------------
Dep.var: ln(sigmau²)                                            
Constant                 -3.786***  (0.602)   -7.096***  (0.828)
z2                      -19.715    (10.599)    8.207***  (1.468)
----------------------------------------------------------------
Dep.var: ln(sigmav²)                                            
Constant                 -4.236***  (0.073)                    
----------------------------------------------------------------
Dep.var: ln(sigmaw²)                                            
Constant                                      -4.819***  (0.177)
----------------------------------------------------------------
eta1 (z1)                                      0.457***  (0.114)
eta2 (z2)                                      0.664***  (0.057)
----------------------------------------------------------------
eta Endogeneity Test                          X2=155.15  p=0.000
----------------------------------------------------------------
Observations                    500                  500        
Log Likelihood                 342.86               713.98      
Mean Prod Efficiency           0.9821               0.9152      
Median Prod Efficiency         0.9946               0.9364      
----------------------------------------------------------------
Notes: Standard errors are in parentheses. Asterisks indicate
significance at the 0.1% (***), 1% (**) and 5% (*) levels.
----------------------------------------------------------------


Mustafa Ugur Karakaplan

Join Date: Jun 2015

Posts: 26
#2

04 Jul 2015, 19:36

Thanks to the amazing Kit Baum, -sfkk- version 1.0.1 is now available on SSC.

This version fixes a bug related to the default matsize that caused the -sfkk- help examples to abort with the "could not find feasible values r(491);" error message. It also includes a couple of minor updates; for example, the header option now shows whether constants are included in the model.
Mustafa Ugur Karakaplan

Join Date: Jun 2015

Posts: 26
#3

16 Sep 2015, 21:16

Thanks to the amazing Kit Baum, -sfkk- version 1.0.2 is now available on SSC.

This version fixes a minor bug that caused -sfkk- to abort with an error message when the header option was specified with no variables in Uhet other than the constant. An example of such a model is included in the help file with this version.

Last edited by Mustafa Ugur Karakaplan; 16 Sep 2015, 21:18.
Mustafa Ugur Karakaplan

Join Date: Jun 2015

Posts: 26
#4

12 Jun 2016, 11:15

Thanks to the amazing Kit Baum, -sfkk- version 1.0.4 is now available on SSC.

You can update your -sfkk- package from SSC by entering the following command in Stata:

Code:

ssc install sfkk, replace

You can check the version of your -sfkk- package by entering the following command in Stata:

Code:

sfkk, ver
Sabrina Ruberto

Join Date: Jul 2016

Posts: 1
#5

20 Jul 2016, 01:00

Dear Dr. Karakaplan,
First, let me thank you and your coauthors for your work, which is particularly relevant to my current research!
I am estimating a stochastic frontier model in which an endogenous variable (durat) affects inefficiency (u). My main interest is in assessing the impact of this variable on the inefficiency of firms.
I am using the command
sfkk o_ric i_empl i_kap i_rawm if year==2007, prod u(durat) en(durat) i(iv_durat) delve nicely header beep compare

and I obtain

----------------------------------------------------------------
                              Model EX             Model EN
                                none                 none
----------------------------------------------------------------
Dep.var: o_ric
Constant                  2.748***  (0.048)    2.821***  (0.038)
i_empl                    0.353***  (0.008)    0.359***  (0.010)
i_kap                     0.085***  (0.005)    0.086***  (0.006)
i_rawm                    0.521***  (0.004)    0.515***  (0.006)
----------------------------------------------------------------
Dep.var: ln(sigmau2)
Constant                 -2.987***  (0.287)   -1.701***  (0.125)
durat                    -0.080**   (0.028)   -0.142***  (0.014)
----------------------------------------------------------------
Dep.var: ln(sigmav2)
Constant                 -2.159***  (0.033)
----------------------------------------------------------------
Dep.var: ln(sigmaw2)
Constant                                      -2.276***  (0.032)
----------------------------------------------------------------
eta1 (durat)                                  -0.008***  (0.001)
----------------------------------------------------------------
eta Endogeneity Test                          X2=116.15  p=0.000
----------------------------------------------------------------
Observations                   4278                 4268
Log Likelihood               -1602.98             -1.7e+04
Mean Prod Efficiency           0.8949               0.8546
Median Prod Efficiency         0.8980               0.8659
----------------------------------------------------------------
Notes: Standard errors are in parentheses. Asterisks indicate
significance at the 0.1% (***), 1% (**) and 5% (*) levels.
----------------------------------------------------------------

If I understand the output correctly, durat seems to (negatively) affect the variance of the inefficiency.
My question is: can I retrieve the impact on the (mean) level of u, analogous to what is possible when estimating the Battese and Coelli (1995) model?

Thank you,

Sabrina

Last edited by Sabrina Ruberto; 20 Jul 2016, 01:04.
Nataly Martinez

Join Date: Jul 2016

Posts: 1
#6

21 Jul 2016, 03:59

Dear Sabrina, thank you for posting this question. I have a similar problem in my research, and it would be great to have a way to retrieve the impact of X on the mean level of u. I hope that Dr. Karakaplan can show us how to do it.
Mustafa Ugur Karakaplan

Join Date: Jun 2015

Posts: 26
#7

01 Sep 2016, 22:06

Hello Sabrina and Nataly,

Thank you so much for contacting me with this question. I am so happy to see that you are finding -sfkk- useful for your research. What you are asking for is a highly requested feature that I am working on. I actually received so many emails about this request that I will make sure this feature is a part of the next released version of -sfkk-. It is possible to retrieve the impact of X on u with the current version of -sfkk- but it is not automatic. It would require 4-5 lines of coding. You may want to look at my SEJ paper about how to calculate the marginal effects of inefficiency variables on u.

http://onlinelibrary.wiley.com/doi/1...12029/abstract
Gronberg, T. J., Jansen, D. W., Karakaplan, M. U. and Taylor, L. L. (2015), School district consolidation: Market concentration and the scale-efficiency tradeoff. Southern Economic Journal, 82: 580–597. doi: 10.1002/soej.12029

The regression results in Sabrina's post are hard to read, so I am copying below the image she sent to me in an email. In her model, the durat variable is endogenous, as the eta Endogeneity Test indicates. In Model EN, durat's effect on inefficiency is larger in absolute terms, and significant at a stricter level, than in Model EX.

By using the efficiency(effvar[, replace]) option of -sfkk-, firm-specific efficiency variables can be created for Model EN and Model EX. You can check the help file of -sfkk- and the examples there to learn more about the efficiency option. Once you create the efficiency variables, you would need to write 3-4 lines of code to retrieve the impact of durat on the mean level of u.
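
A rough version of this calculation can also be sketched directly from the reported coefficients, without the efficiency variables. The sketch below is not the procedure from the SEJ paper cited above. It assumes that u is half-normal with ln(sigmau2) = phi0 + phi1*durat, so that E(u) = sqrt(2/pi)*exp((phi0 + phi1*durat)/2) and the marginal effect of durat on E(u) is (phi1/2)*E(u). The two coefficients are the rounded Model EN values from post #5; the five durat values and the names phi0, phi1, Eu, and me_durat are made up for illustration.

```stata
* Sketch only: unconditional E(u) and its marginal effect, assuming a
* half-normal u with ln(sigma_u^2) = phi0 + phi1*durat.
clear
set obs 5
generate double durat = 5*(_n - 1)    // hypothetical values: 0, 5, 10, 15, 20

scalar phi0 = -1.701                  // Model EN constant in ln(sigmau2), post #5
scalar phi1 = -0.142                  // Model EN coefficient on durat, post #5

* E(u) = sigma_u * sqrt(2/pi), with sigma_u = exp((phi0 + phi1*durat)/2)
generate double Eu = sqrt(2/_pi) * exp((phi0 + phi1*durat)/2)

* dE(u)/d(durat) = (phi1/2) * E(u); it has the sign of phi1 because E(u) > 0
generate double me_durat = (phi1/2) * Eu

list durat Eu me_durat
summarize me_durat
```

With real data, you would drop the first three commands and use the durat variable already in memory; the mean of me_durat is then an average marginal effect under these assumptions. Standard errors would need the delta method or a bootstrap and are not covered here.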
Emal Jan

Join Date: Dec 2016

Posts: 2
#8

16 Dec 2016, 20:41

Dear Dr. Karakaplan,

First of all, thank you so much for this great work; it makes it very easy to address the most difficult problem in stochastic frontier models. I have the following questions about estimating a stochastic frontier, and I would be very grateful if you could answer them:

1. I have 7 variables in the frontier, of which I think 6 are endogenous (the test also confirms it), so I have 6 IVs. The command works fine as long as I have up to three endogenous variables and therefore three IVs, but it fails when I add the fourth, fifth, and sixth endogenous variables and their respective IVs, giving an error (initial values are not feasible). I tried using delve and/or initial values but still get the same error message. I have checked my data and how the IVs are constructed again and again, but the error won't go away. The exogenous model with the same specification works fine too. Have you tried sfkk with 4+ endogenous variables?
2. When estimating a translog, are the squared terms and interaction terms of endogenous variables also endogenous? That is, should I also use the squares and interactions of the IVs, just as for the other variables in the frontier?

Thank you very much in advance.
Emal Jan
Mustafa Ugur Karakaplan

Join Date: Jun 2015

Posts: 26
#9

17 Dec 2016, 18:15

Hello Emal,
Thank you very much for the questions. I am glad that you use sfkk for your research. To answer your questions:

1- Older versions of -sfkk- had no limit for the number of endogenous variables. The most recent version limits the number to 3 due to some core changes in the program. The next update will remove this limit to allow more endogenous variables. For now, you can use the previous version of sfkk. If you cannot find it, please send me an email and I will send you the package.

2- There are two sfkk options (exogenous and leaveout) that you can use to deal with endogenous square and interaction terms. For example if x and x2 (x square) are endogenous, you can write: ....., en(x) i(xiv) leave(x2) in your options to instrument for the main endogenous variable and leave the square (or interactions) out of the included instruments list. In the help file, you can find examples of these two options.
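
As a fuller sketch of the leaveout case, the block below simulates a small dataset and passes the square of an endogenous regressor to leave(). Everything about the data is hypothetical (the names y, x, x2, xiv, e, v, u and all the numbers), and only option abbreviations already used in this thread appear in the command. Convergence on simulated data is not guaranteed, so treat it as an illustration of the syntax and not as a tested recipe.

```stata
* Sketch only: one endogenous regressor x, its square x2, and one instrument xiv.
clear
set seed 12345
set obs 500
generate double xiv = rnormal()                // excluded instrument (hypothetical)
generate double e   = rnormal()                // first-stage error
generate double x   = 0.8*xiv + e              // endogenous regressor
generate double x2  = x^2                      // square of the endogenous regressor
generate double v   = 0.2*e + 0.2*rnormal()    // noise correlated with x through e
generate double u   = abs(0.3*rnormal())       // half-normal inefficiency
generate double y   = 1 + 0.5*x + 0.1*x2 + v - u

* x is instrumented by xiv; leave(x2) takes x2 out of the included instruments list
sfkk y x x2, prod en(x) i(xiv) leave(x2)
```

The square stays in the frontier variable list; leave() only changes whether it is used as an included instrument.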

Best,
Mustafa
Emal Jan

Join Date: Dec 2016

Posts: 2
#10

18 Dec 2016, 08:12

Dear Dr. Karakaplan,

Thank you very much for your prompt response; I really appreciate your support.

When will sfkk be updated to allow for more endogenous variables? I am really looking forward to it, as my research depends so much on it. Yes, I will send you an email to request the older version of sfkk, and I look forward to the next update.

Thank you very much again!
My best regards,
Emal
maria ago

Join Date: Oct 2017

Posts: 6
#11

30 Oct 2017, 02:42

Dear Dr. Karakaplan,
First, I wish to sincerely thank you for your work! I would be very grateful if you could answer the following question.
To estimate my stochastic production frontier, I run:
sfkk o_Y i_L i_K, prod u(x1 x2 x3) en(x1) i(iv_x1) delve nicely header compare

According to my results, the x1 coefficient is negative, so x1 seems to negatively affect the variance of the inefficiency.

Since the inefficiency u follows a half-normal distribution, for which E(u) = sqrt(2*variance)/sqrt(pi), where variance denotes the sigmau² parameter modeled in u(), may I conclude that x1 also negatively affects the mean level of u?
Please note that I am interested only in the sign of this effect, not in its magnitude.

Thank you very much in advance.
Maria
Mustafa Ugur Karakaplan

Join Date: Jun 2015

Posts: 26
#12

02 Dec 2017, 02:28

Hello Maria,

Your interpretation of the sign of x1 on the mean level of u seems correct.
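
A short derivation suggests why. It is a sketch that assumes the half-normal specification with a log-linear variance, as in the ln(sigmau²) block of the sfkk output tables; the symbols $z_i$, $\varphi$, and $\varphi_1$ (the coefficient on x1) are notation introduced here.

```latex
\begin{align*}
\sigma_{u,i}^{2} &= \exp(z_i'\varphi), \\
E(u_i) &= \sqrt{\frac{2}{\pi}}\,\sigma_{u,i}
        = \sqrt{\frac{2}{\pi}}\,\exp\!\left(\tfrac{1}{2}\, z_i'\varphi\right), \\
\frac{\partial E(u_i)}{\partial x_{1,i}} &= \frac{\varphi_1}{2}\,E(u_i).
\end{align*}
```

Because $E(u_i) > 0$ for every observation, the derivative has the same sign as $\varphi_1$, provided x1 enters u() only linearly.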

Best,
Mustafa
Hayat Ahmadzai

Join Date: Aug 2018

Posts: 3
#13

02 Aug 2018, 09:04

Dear Dr. Mustafa Karakaplan,

First, I would like to congratulate you on your great achievement in introducing sfkk in Stata and on your wonderful paper "handling endogeneity in SFA models".

I am currently using your method. I have a few quick questions and would really appreciate your help.

I estimated my model, and here are the results of the endogeneity test:

eta1(cdi) -0.380*** (0.076)
eta Endogeneity Test chi2=25.13 (0.000)

My questions:
1) What does eta1(cdi) mean? How do you interpret the coefficient, its sign, and its significance? Please note that "cdi" is my endogenous variable.
2) According to your paper (Karakaplan and Kutlu (2017), Economics Bulletin), the endogeneity test is a test of the components of the eta (η) term. I read the paper but still cannot figure out what exactly those components are. I would like to describe the test in detail in my paper to make it easier for readers.
3) Based on the results above, I concluded that the η test detects endogeneity in the model and that correction is necessary. Is that correct? Is the null hypothesis that there is no endogeneity, so that a rejection means endogeneity is detected?
4) For now, sfkk allows only the half-normal distribution and NOT the truncated-normal, true?

I would highly appreciate your help and guidance.

Many thanks and best wishes,
Hayat
Mustafa Ugur Karakaplan

Join Date: Jun 2015

Posts: 26
#14

02 Aug 2018, 15:46

Hello Hayat,

I am glad that you find my methodology and Stata command useful for your research. Below are some of my papers that would help you understand what eta endogeneity test results mean. You'll find the description of the eta tests as well as plenty of examples in these papers:

Karakaplan, Mustafa U. and Kutlu, Levent (2017) "Handling Endogeneity in Stochastic Frontier Analysis." Economics Bulletin
Karakaplan, Mustafa U. (2017) "Fitting Endogenous Stochastic Frontier Models in Stata." The Stata Journal
Karakaplan, Mustafa U. and Kutlu, Levent (2018) "School District Consolidation Policies: Endogenous Cost Inefficiency and Saving Reversals." Empirical Economics
Karakaplan, Mustafa U. and Kutlu, Levent (2017) "Endogeneity in Panel Stochastic Frontier Models." Applied Economics

Answers to your questions:
1) eta1 (cdi) is part of the eta test outlined in Karakaplan and Kutlu (2017). eta enters the log likelihood function in equation 8 as the coefficient in front of the correction term. "The significance of the k'th component of eta indicates that x_ik (the k'th component of x_i) and v_i are correlated. Hence, a particular variable of interest is endogenous if the corresponding component of eta term is significant." An individual eta indicates the explanatory power of the correction term applied to that particular variable, given the IVs you chose. Your eta term for cdi is negative and significant at the 0.1% level. The sign indicates the direction in which the correction is applied to the model, and the significance indicates whether that correction term picks up endogeneity in the model.
2) The answer above covers this question too.
3) Yes, you are right. Your eta endogeneity test results show that there is endogeneity in your model and that correction for the endogeneity of cdi is needed. To see in detail how well your excluded IVs explain cdi, you can remove the "nicely" option from your command line.
4) Yes, sfkk allows only the half-normal distribution. You could modify the sfkk program itself to include the truncated-normal, but I would not expect a major change in your results from changing the distribution. There is research about that.
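
As a quick arithmetic check on the numbers in post #13, the individual eta estimate and the joint test can be related by hand. The sketch below assumes that the reported statistic is a Wald chi-squared with one degree of freedom per endogenous variable, which is an assumption and not something stated in this thread; the scalar names are made up for illustration.

```stata
* Sketch only: compare the squared z statistic of eta1 with the reported chi2,
* assuming a Wald test with 1 degree of freedom.
scalar eta1 = -0.380                  // eta1 (cdi) as reported in post #13
scalar se1  = 0.076                   // its reported standard error
scalar wald = (eta1/se1)^2            // squared z statistic from the rounded values

display "Wald chi2 from rounded values = " %6.2f wald
display "p-value for reported chi2     = " %10.8f chi2tail(1, 25.13)
```

The rounded estimate and standard error give a statistic of about 25, close to the reported 25.13, and the p-value is far below 0.001, in line with the reported 0.000.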

Best,
Mustafa

Last edited by Mustafa Ugur Karakaplan; 02 Aug 2018, 15:48.
Hayat Ahmadzai

Join Date: Aug 2018

Posts: 3
#15

02 Aug 2018, 17:38

Dear Dr. Mustafa,

Thank you very much for sharing the papers and your detailed answers. This helps a lot.
Best wishes.
Hayat
