  • Convergence problem with Stochastic Frontier Analysis (SFA) / Maximum likelihood

    Dear all,

    I'm trying to run a Stochastic Frontier Analysis (SFA) on a panel dataset of around 35,152 observations (different banks over a time span of 18 years). My goal is to use SFA to estimate different (in)efficiency indicators. (For those not familiar with SFA – like me two months ago – the error term in SFA is decomposed into an idiosyncratic error and an inefficiency term.)
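    To fix notation, here is a minimal sketch of the standard composed-error setup for a cost frontier (generic SFA notation, not specific to my dataset):

    Code:
    ln TOC_it = f(inputs, outputs; beta) + v_it + u_it,   u_it >= 0

    where v_it is the idiosyncratic error and u_it is the non-negative inefficiency term; for a cost frontier the composed error is v + u (rather than v - u as in a production frontier).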

    To my knowledge, there are two different commands to conduct such an analysis: the built-in frontier/xtfrontier commands and the user-written packages sfcross/sfpanel. Unfortunately, I’m unable to post a sample of my dataset because the data are confidential. I hope this is not too big of a problem.

    My preferred (simple) model looks like the following:
    Code:
    frontier lnTOC ln_a1 ln_a2 ln_y1 ln_y2 ln_y3 ln_y4 ln_z t c.t#c.t c.ln_y1#c.t c.ln_y2#c.t c.ln_y3#c.t c.ln_y4#c.t c.ln_a1#c.t c.ln_a2#c.t, cost
    
    where :
    TOC = total cost
    a1 = Input price 1 / Input price 3
    a2 = Input price 2 / Input price 3
    y1 = Output 1
    y2 = Output 2
    y3 = Output 3
    y4 = Output 4
    z = Equity (control variable)
    t = Time trend
    The prefix “ln_” indicates that a variable is in logarithmic terms. To simplify the commands that follow, the names of the covariates are stored in a global macro, ${SFACD_stata}.
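    For completeness, the macro definition would look roughly like this (the variable list is taken from the model above):

    Code:
    global SFACD_stata "ln_a1 ln_a2 ln_y1 ln_y2 ln_y3 ln_y4 ln_z t c.t#c.t c.ln_y1#c.t c.ln_y2#c.t c.ln_y3#c.t c.ln_y4#c.t c.ln_a1#c.t c.ln_a2#c.t"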

    Now to my problem: when I estimate the model with a half-normal distribution for the inefficiency term, the estimation works fine. However, I also want to study what determines inefficiency, so I repeated the estimation with a truncated-normal distribution by typing the following:
    Code:
    frontier lnTOC ${SFACD_stata}, cost distribution(tnormal)
    or equivalently:

    Code:
    sfcross lnTOC ${SFACD_stata}, cost distribution(tnormal)
    The built-in command gives me the error message

    Code:
    initial values not feasible
    r(1400);
    and does not even start with the ML iterations.

    I found a workaround by supplying the initial values myself:

    Code:
    capture matrix drop b0
    * OLS to obtain starting values for the frontier coefficients
    regress lnTOC ${SFACD_stata}
    * append starting values for the auxiliary parameters (variance terms and mu)
    matrix b0 = e(b), ln(e(rmse)^2), .1, 0
    frontier lnTOC ${SFACD_stata}, cost distribution(tnormal) ufrom(b0)
    However, now both commands run into an endless iteration process, repeatedly issuing the “backed up” message. I stopped the iterations at the point where no further progress on the log likelihood was being made. The result is the following:

    Code:
    sfcross lnTOC ${SFACD_stata}, cost distribution(tnormal)
    
    Stoc. frontier normal/tnormal model                  Number of obs =     35152
                                                         Wald chi2(17)  =  1.41e+06
                                                         Prob > chi2   =    0.0000
    
    Log likelihood = -7436.3600
    ------------------------------------------------------------------------------
           lnTOC |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    Frontier     |
           ln_w1 |   .0093599   .0056242     1.66   0.096    -.0016634    .0203831
           ln_w2 |  -.0832457   .0154317    -5.39   0.000    -.1134913       -.053
           ln_w3 |  -.6503502   .0125998   -51.62   0.000    -.6750454   -.6256551
           ln_y1 |   .0969384   .0029174    33.23   0.000     .0912204    .1026564
           ln_y2 |   .5061618   .0048146   105.13   0.000     .4967255    .5155982
           ln_y3 |   .0362708   .0016303    22.25   0.000     .0330754    .0394661
           ln_y4 |  -.0169388   .0020185    -8.39   0.000    -.0208949   -.0129826
            ln_z |   .3901585    .004274    91.29   0.000     .3817817    .3985353
               t |   .0106885   .0059175     1.81   0.071    -.0009096    .0222866
                 |
         c.t#c.t |  -.0008177   .0001148    -7.12   0.000    -.0010427   -.0005927
                 |
     c.ln_y1#c.t |  -.0002508   .0002763    -0.91   0.364    -.0007923    .0002907
                 |
     c.ln_y2#c.t |  -.0029398   .0003671    -8.01   0.000    -.0036593   -.0022203
                 |
     c.ln_y3#c.t |  -.0000199   .0001442    -0.14   0.890    -.0003026    .0002628
                 |
     c.ln_y4#c.t |    .000443    .000195     2.27   0.023     .0000609    .0008251
                 |
     c.ln_w1#c.t |   .0027755   .0005455     5.09   0.000     .0017063    .0038446
                 |
     c.ln_w2#c.t |  -.0058779   .0014513    -4.05   0.000    -.0087223   -.0030334
                 |
     c.ln_w3#c.t |  -.0085252   .0009066    -9.40   0.000    -.0103022   -.0067483
                 |
           _cons |  -1.466262   .0636535   -23.04   0.000     -1.59102   -1.341503
    -------------+----------------------------------------------------------------
    Mu           |
           _cons |  -410.5442   45.20626    -9.08   0.000    -499.1468   -321.9416
    -------------+----------------------------------------------------------------
    Usigma       |
           _cons |   4.839221   .1100933    43.96   0.000     4.623442       5.055
    -------------+----------------------------------------------------------------
    Vsigma       |
           _cons |  -3.902038   .0167783  -232.56   0.000    -3.934923   -3.869153
    -------------+----------------------------------------------------------------
         sigma_u |   11.24148    .618806    18.17   0.000     10.09178    12.52216
         sigma_v |   .1421292   .0011923   119.20   0.000     .1398113    .1444855
          lambda |   79.09341   .6188649   127.80   0.000     77.88046    80.30636
    ------------------------------------------------------------------------------
    H0: No inefficiency component:            z =  75.946          Prob>=z = 0.000
    I cannot see a coefficient on the covariates that is outlandishly high or low, which would point toward a problem with my model. However, the estimated mean of the inefficiency term appears implausibly large in magnitude and also takes a surprising sign (negative instead of positive). I need to estimate a truncated-normal model in order to use the emean() option to find the determinants of inefficiency.
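    For reference, the specification I am ultimately aiming for would look roughly like this (det1 and det2 are placeholders for my candidate determinants of inefficiency; emean() is sfcross's option for modelling the mean of the truncated-normal inefficiency term):

    Code:
    sfcross lnTOC ${SFACD_stata}, cost distribution(tnormal) emean(det1 det2)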

    As I stated before, I'm relatively new to SFA, so I would appreciate any help.

    Thanks in advance (and sorry for the long text),

    Sebastian

  • #2
    Hi Sebastian.

    I just saw your post and I am surprised you have no answer yet. I have two comments. First, you do not need to estimate a truncated-normal specification to find the determinants of inefficiency: you can estimate a heteroskedastic specification for var_u and obtain the determinants that way. Second, try the user-written command sfmodel, which uses a more explicit ml structure. Kumbhakar et al. (2015), A Practitioner's Guide to Stochastic Frontier Analysis Using Stata, has been very useful for my own SFA estimations.
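    A minimal sketch of the heteroskedastic route with the built-in command (det1 and det2 are placeholders for the determinants; frontier's uhet() option parameterizes the variance of the inefficiency term, here with the default half-normal distribution):

    Code:
    frontier lnTOC ${SFACD_stata}, cost uhet(det1 det2)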

    Best,

    Marcelo
