Regressing estimates of efficiency obtained using Data Envelopment Analysis (DEA) on environmental variables (panel data)

Valentina Demchuk

Join Date: Jul 2019

Posts: 3
#1

11 May 2020, 11:09

Hello all,

For my thesis I am trying to estimate how various environmental factors in a country (e.g. GDP per capita, interest rates, etc.) affect the efficiency of insurance companies. I have an unbalanced panel of 1209 observations on 145 DMUs over a span of 12 years.

Can anyone suggest a command or code to carry this out? I would like to use the two-stage efficiency analysis described in Simar and Wilson (2007), but the simarwilson command in Stata is only suitable for cross-sectional data, and my knowledge of Stata is insufficient to code it on my own. I am also not sure whether, when using the dea command, I should calculate efficiency scores for each year separately or for the entire sample.
Any help would be appreciated.

Kind regards
Harald Tauchmann

Join Date: Aug 2017

Posts: 30
#2

01 Jun 2022, 08:50

Dear Valentina:
Applying the Simar & Wilson (2007) approach to panel data appears to be difficult, not only in terms of coding but first of all conceptually. Simar & Wilson (2007) treat DMUs as independent observational units, an assumption that is violated in panel data, which in turn may motivate one to look for a ‘panel version’ of the estimator.
(i) The first question is whether to run the DEA on the pooled sample or separately on each cross section. The key thing to think about is whether one wants to assume that the production possibility frontier is the same for the entire period considered or whether it changes over time. My impression is that most applied researchers don’t want to assume a common frontier and instead estimate a different frontier for each period. This may, however, be regarded as a rather inefficient approach, since the information that the frontier in t+1 will most likely not be too different from the frontier in t is not used.
(ii) One has to think about whether to assume the unobserved heterogeneity to be 'random' [uncorrelated with the regressors] or 'fixed' [correlation with regressors allowed]. In the random effects case things may get tricky, since one has to integrate this into the Simar-Wilson bootstrap, i.e. one would have to sample from a truncated multivariate normal distribution. Even worse, from an applied Stata perspective, to my knowledge Stata does not provide a routine for estimating the truncated normal model with random effects. Technically, fixed effects would be much simpler, since one could, under the assumption that there is no error correlation left once DMU fixed effects are included, just use existing commands (teradial, simarwilson). Yet it is well known that ‘brute force’ fixed effects (i.e. just including a full set of DMU dummies "i.dmu") generate an incidental parameters problem, i.e. the truncated regression estimator is no longer consistent for N -> inf.
One may well debate how relevant asymptotic properties are for applied work. Yet it is well known that using ‘brute force’ fixed effects may result in substantial biases in finite samples and may also generate technical problems such as convergence issues.
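To make point (i) concrete, here is a toy sketch in Python (not Stata, and not the Simar-Wilson procedure itself). It assumes a single input, a single output and constant returns to scale, in which case the DEA score reduces to each unit's output/input ratio divided by the best ratio in the reference set. The data and the helper name crs_efficiency are made up for illustration.

```python
from collections import defaultdict


def crs_efficiency(units):
    """Score each unit against the best output/input ratio in `units`.

    `units` is a list of (key, input, output) tuples. With one input and
    one output under constant returns to scale, the DEA efficiency score
    equals this relative ratio (1.0 means the unit is on the frontier).
    """
    best = max(y / x for _, x, y in units)
    return {key: (y / x) / best for key, x, y in units}


# Hypothetical panel: (dmu, year, input, output)
data = [
    ("A", 2019, 10.0, 20.0),
    ("B", 2019, 10.0, 10.0),
    ("A", 2020, 10.0, 40.0),
    ("B", 2020, 10.0, 30.0),
]

# Pooled frontier: all DMU-years share one reference set.
pooled = crs_efficiency([((d, t), x, y) for d, t, x, y in data])

# Year-specific frontiers: each year is its own reference set.
by_year = defaultdict(list)
for d, t, x, y in data:
    by_year[t].append(((d, t), x, y))

yearly = {}
for t, units in by_year.items():
    yearly.update(crs_efficiency(units))

for key in sorted(pooled):
    print(key, "pooled:", pooled[key], "year-specific:", yearly[key])
```

In this made-up example DMU A in 2019 is on the frontier when each year gets its own frontier, but scores only 0.5 against the pooled frontier, because the 2020 observations push the frontier outwards. With several inputs and outputs the scores have to come from solving linear programs, which is what dea and teradial do.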
Best wishes,
Harald
PS: Using the (user-written) command teradial for calculating DEA efficiency scores might be more efficient than using dea.