Inference in Stochastic Frontier Models Based on Asymmetry

Stochastic frontier analysis (SFA) is often employed to study production functions. The structure of the errors is the main difference between standard regression analysis and stochastic frontier models: in the SFA, an independent random term with positive value is added to the usual white-noise error. Conventionally, the parameters involved in the SFA are estimated first and the convenience of using this model is tested afterwards. The authors propose to study the residuals beforehand in order to check whether a stochastic frontier model can reasonably be assumed and then, if applicable, to estimate the parameters. With this goal, several non-parametric hypothesis tests are explored. Monte Carlo simulations suggest that, under the usual conventions, simple inference based on skewness provides competitive results and proves to be robust in the presence of outliers. Standard methods based on maximum likelihood obtained really poor results; they were only reasonable when both the sample size and the inefficiency term were large.


Introduction
The main objective of production frontier or Stochastic Frontier Analysis (SFA) is the study of the production function. In spite of the fact that the empirical estimation of production functions began long before, the contemporary formulation of the problem was proposed by Aigner et al. [1] and Meeusen et al. [2]. Formally, the following general model is considered:

Y = g(X) + ε,  (1)

where X is the vector of regressors, g(·) is the deterministic production function and ε is the deviation from the frontier production. In the SFA model, ε is considered to be ε = v − u, where v is the usual white noise and u is a non-negative quantity which represents the technical inefficiency. Conventionally, both quantities are assumed to be drawn independently. Although different semiparametric [3,4] and non-parametric models have been proposed for the production function [5], the translog formulation is still the most frequent; it is linear in the parameters, and the variables are the logarithmic transformations of the originals. This specification has been shown to mis-specify cost and production functions in many cases [6]; of course, a mis-specified response function is almost sure to affect the estimated residuals. Since the original normal-half normal and normal-exponential models were introduced by Aigner et al. [1], different parametric models have been considered for the pair {u, v}. The usual procedure estimates the specific frontier parameters (when it is possible) and then computes the p-values. When the parameters cannot be computed, it is assumed that the stochastic frontier model (SFM) is not adequate.
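As a quick illustration of this error structure (a sketch of ours, not code from the paper; the parameter values are arbitrary), drawing the composed error ε = v − u of a normal-half normal model shows the negative skewness that the inefficiency term induces:

```python
import numpy as np

rng = np.random.default_rng(12345)

# Illustrative parameters (assumptions, not values from the paper).
sigma_v, sigma_u, n = 1.0, 1.5, 100_000

v = rng.normal(0.0, sigma_v, n)          # white-noise term
u = np.abs(rng.normal(0.0, sigma_u, n))  # half-normal inefficiency, u >= 0
eps = v - u                              # composed error of the SFA model

# Sample skewness: negative, because the inefficiency stretches the left tail.
m2 = np.mean((eps - eps.mean()) ** 2)
m3 = np.mean((eps - eps.mean()) ** 3)
skew = m3 / m2 ** 1.5
print(skew)
```

With these parameters the population skewness is roughly −0.3; larger σ_u/σ_v ratios make the asymmetry more pronounced.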
In spite of the huge number of published works dealing with frontier models and related problems, hypothesis testing has received less attention in the specialized literature. Via Monte Carlo simulations, Coelli [11] concluded that both the likelihood-ratio and Wald based tests obtained incorrect size, while a method based on the third moment of the ordinary least squares residuals obtained an adequate size. Recently, Simar & Wilson [14] proposed to use a bootstrap method in order to develop inferences for the parameters involved in the SFM; their method provides useful information about inefficiency and model parameters irrespective of whether the residuals have skewness in the desired direction. Kuosmanen & Fosgerau [15] proposed to evaluate the SFM validity from the skewness of the regression residuals.
Once the regression parameters of equation (1) have been estimated, the distribution of the residuals can be studied. The relationship between the asymmetry of the residuals and the inefficiency term is used in order to propose a slight modification of the MOM procedure which, via Monte Carlo simulations, proves to be more robust than the rest of the studied methods in the presence of outliers.
The rest of the paper is organized as follows: section 2 is devoted to some brief remarks about the available software. Theoretical aspects of the proposed measures are exposed in section 3. The performance of four different tests is studied via Monte Carlo simulations in section 4; three different scenarios were considered: normal-half normal, contaminated normal-half normal and the well-known WHO data structure with normal-half normal residuals. In section 5 the WHO data are directly analyzed and, finally, in section 6, the obtained results are discussed.

Software
As in the rest of the scientific fields, statistical advances are closely related with computational development. In addition, the popularization of a particular statistical procedure strongly depends on the characteristics of the available software. Currently, there exist several programs and routines for stochastic frontier and efficiency analysis, the most general being LIMDEP/NLOGIT and Stata. In addition, the freeware program FRONTIER 4.1, developed by Tim Coelli (it can be easily found on the web), can also be used for a small range of stochastic frontier models. Unfortunately, FRONTIER is quite old; however, a version of FRONTIER 4.1 in the free environment of R has been developed by Arne Henningsen. In addition, Belotti et al. have written some new routines for Stata, and Christopher Parmeter has written some additional code in R. LIMDEP has included an extensive package for frontier modeling since the mid 1980s, and new features and models are added to LIMDEP on an ongoing basis. Recently, Wilson [16] developed the R package FEAR for computing nonparametric efficiency estimates, making inference, and testing hypotheses in frontier models.

Notice that the parameter λ = σ_u/σ_v relates the variability of the perturbation from the inefficiency term (u) with that of the white-noise perturbation (v), not with the total production measured; its value is therefore not directly associated with the production inefficiency but with the global system inefficiency, i.e., the value of λ is affected by the capacity of the production function to predict the optimal production. Hence, in order to determine the pertinence of applying an SFA, we must previously check whether the available data provide enough statistical power to perform the testing.

Inference based on asymmetry
In practice, model (1) is transformed to obtain a new model in which, as usual, a zero-mean perturbation is involved. For each residual, consider the indicator which takes the value 1 when the residual is non-positive (and 0 otherwise). Applying the traditional sign test, we first propose to check the contrast in (2) from the proportion of non-positive residuals; despite the dependence among the elements of the residual vector, this yields an approximated critical region for the contrast (4) at level α. When, as usual, it is assumed that the white noise v follows a symmetric distribution, the hypothesis in (2) can also be tested from the asymmetry of the residuals. Besides, for normally distributed residuals, the Central Limit Theorem (CLT) and Slutsky's Lemma guarantee the following result.
This gives an (asymptotic) critical region for the contrast (6) at level α. It is worth noting that the asymptotic normality of the third-moment estimator can be proved under weaker conditions; however, parametric models must be assumed in order to approximate its variance.
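The two statistics discussed above can be sketched as follows (an illustrative implementation of ours, not the authors' code, assuming centered OLS residuals and the standard normal approximations; function names are ours):

```python
import numpy as np
from scipy import stats

def sign_test_stat(resid):
    """Normal-approximation statistic for H0: P(resid <= 0) = 1/2.

    Under the SFA alternative (negative skewness), centered residuals
    fall below zero less often than half the time, so large negative z
    points towards inefficiency.
    """
    resid = np.asarray(resid)
    n = resid.size
    n0 = np.sum(resid <= 0)
    z = (n0 - n / 2.0) / np.sqrt(n / 4.0)
    return z, stats.norm.cdf(z)          # one-sided (left-tail) p-value

def skewness_test_stat(resid):
    """CLT-based statistic for H0: zero third moment (normal residuals).

    The sample skewness has asymptotic variance 6/n under normality.
    """
    resid = np.asarray(resid)
    n = resid.size
    z = stats.skew(resid) * np.sqrt(n / 6.0)
    return z, stats.norm.cdf(z)          # one-sided (left-tail) p-value

# Toy check with a clearly left-skewed, centered sample (lambda = 2).
rng = np.random.default_rng(7)
eps = rng.normal(size=5000) - np.abs(rng.normal(0.0, 2.0, size=5000))
eps = eps - eps.mean()                   # mimic OLS centering
z_sign, p_sign = sign_test_stat(eps)
z_skew, p_skew = skewness_test_stat(eps)
```

Both statistics are negative on this sample; the skewness statistic is far more extreme, which is consistent with the sign test being the more conservative of the two.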

Confidence intervals
Once it has been decided that, for the studied dataset, a stochastic frontier is the appropriate model (this decision can be taken even when the null hypothesis in (2) is not rejected; in fact, the question of the "wrong" skewness deserves some discussion, which is included in the final section), the next step is to estimate the involved parameters and to compute the respective confidence intervals. Usually, some parametric model is assumed at this point. In this section, inferences on λ are developed for the normal-half normal model and, from Monte Carlo simulations, the behavior of the obtained expressions is studied.
Under the normal-half normal model, it is well known that v ~ N(0, σ_v²) and u follows a half-normal distribution with scale parameter σ_u; in what follows, φ and Φ stand for the density and the distribution functions of a standard normal law, respectively. Then, the probability of observing a non-positive (centered) residual, G(λ) = P{ε − E[ε] ≤ 0}, only depends on the value of λ. Unfortunately, the inverse function G⁻¹ cannot be directly computed, and numerical methods must be used with this goal. Figure 1 depicts the numerical computation of the function G(λ) for λ ∈ [0, ∞).
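One way to compute and invert G numerically can be sketched as follows (our reading, assuming G(λ) = P{ε − E[ε] ≤ 0}; it uses the standard fact that the normal-half normal composed error is skew-normal with shape −λ, taking σ_v = 1 without loss of generality):

```python
import numpy as np
from scipy import stats, optimize

def G(lam):
    """P(eps - E[eps] <= 0) for the normal-half normal composed error.

    With sigma_v = 1 (the probability only depends on lambda),
    eps = v - u is skew-normal with shape -lam and scale sqrt(1 + lam^2),
    and E[eps] = -lam * sqrt(2 / pi).
    """
    scale = np.sqrt(1.0 + lam ** 2)
    mean_eps = -lam * np.sqrt(2.0 / np.pi)
    return stats.skewnorm.cdf(mean_eps, a=-lam, scale=scale)

def G_inverse(p, lam_max=50.0):
    """Numerically invert G on [0, lam_max]; G decreases from 1/2."""
    return optimize.brentq(lambda lam: G(lam) - p, 0.0, lam_max)
```

G decreases from G(0) = 1/2 towards an asymptote slightly above 0.42, so the inversion is only meaningful for observed proportions inside that (rather narrow) range, which is one reason large samples are needed.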
As is well known, the value of λ can also be derived from the third central moment of the residuals. In particular, under the normal-half normal model it holds that μ₃ = √(2/π)(1 − 4/π)σ_u³ and Var(ε) = σ_v² + (1 − 2/π)σ_u², so λ can be directly estimated by using a plug-in method.
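The plug-in (MOM) estimation can be sketched as follows (our illustrative code; the moment relationships are the standard ones for the normal-half normal model):

```python
import numpy as np

SQRT_2_PI = np.sqrt(2.0 / np.pi)

def mom_lambda(resid):
    """Method-of-moments estimate of lambda = sigma_u / sigma_v from
    residuals, under the normal-half normal model.

    Uses mu_3 = sqrt(2/pi) * (1 - 4/pi) * sigma_u^3 and
    Var(eps) = sigma_v^2 + (1 - 2/pi) * sigma_u^2.
    Returns None when the residuals show "wrong" (positive) skewness.
    """
    resid = np.asarray(resid)
    m2 = np.mean((resid - resid.mean()) ** 2)
    m3 = np.mean((resid - resid.mean()) ** 3)
    if m3 >= 0:                      # wrong skewness: sigma_u^3 would be < 0
        return None
    sigma_u = (m3 / (SQRT_2_PI * (1.0 - 4.0 / np.pi))) ** (1.0 / 3.0)
    sigma_v2 = m2 - (1.0 - 2.0 / np.pi) * sigma_u ** 2
    if sigma_v2 <= 0:
        return None
    return sigma_u / np.sqrt(sigma_v2)

# Quick check on synthetic data with known lambda = 1.5.
rng = np.random.default_rng(0)
n = 200_000
eps = rng.normal(size=n) - np.abs(rng.normal(0.0, 1.5, size=n))
lam_hat = mom_lambda(eps)
```

Note the two failure modes (`None`): they mirror the situations in which the MOM procedure "cannot be computed", which in practice happens with non-negligible frequency in small samples.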

Monte carlo simulations
The practical behavior of the methodology exposed above is studied via Monte Carlo simulations. Three different scenarios have been considered. The first one is a simple normal-half normal situation: a sample of size N of the regressor X is drawn from a standard normal distribution and then the response is computed from model (1) with a normal-half normal perturbation. In the second scenario, the normal-half normal residuals are contaminated with outliers (contaminated normal-half normal). Finally, in the third scenario, we consider a model with one dependent variable and three regressors; the covariance matrix and the linear regression model structure, as well as the total variance of the residuals, are based on the WHO data, adding a normal-half normal perturbation (complete information on this dataset is provided in the Appendix).
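The first scenario can be sketched as follows (our reading of the description; the regression coefficients are illustrative assumptions, not values from the paper):

```python
import numpy as np

def draw_scenario1(n, lam, sigma_v=1.0, beta0=1.0, beta1=1.0, rng=None):
    """Scenario 1 sketch: one standard-normal regressor and a
    normal-half normal composed error with lambda = sigma_u / sigma_v.

    The intercept/slope values are illustrative assumptions.
    """
    rng = rng or np.random.default_rng()
    x = rng.normal(size=n)
    v = rng.normal(0.0, sigma_v, size=n)
    u = np.abs(rng.normal(0.0, lam * sigma_v, size=n))
    y = beta0 + beta1 * x + v - u
    return x, y

def ols_residuals(x, y):
    """Centered OLS residuals of y on (1, x)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

x, y = draw_scenario1(50_000, lam=1.5, rng=np.random.default_rng(42))
resid = ols_residuals(x, y)
```

The residuals produced this way are centered by construction, so only their asymmetry (not their mean) carries information about the inefficiency term.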

Hypothesis testing
In this subsection, we investigate the capacity of the previous tests for detecting the presence of skewness under the three scenarios described above. For moderate values of λ and N = 1,000, the observed power does not achieve 80%. Table 2 shows the observed rejection proportions for 5,000 Monte Carlo iterations under scenario 2 (contaminated normal-half normal). All considered tests overestimated the nominal level. Logically, the Shapiro-Wilk test increases the rejection proportion with the sample size; note that, in this case, the null hypothesis is not true because the residuals are not normally distributed. Figure 3 shows the equivalent results to Figure 2 for the contaminated normal-half normal scenario. Because, in this scenario, the normality of the residuals does not hold even for λ = 0, the Shapiro-Wilk test has not been included in this plot. The Wald test was, again, the most powerful one; remember that its rejection proportion under the null was also really high. The tests based on Sk and P0 had a similar behaviour, although Sk was always better than P0. Finally, Table 3 shows the observed rejection proportions, under the null, for the third considered scenario. The tests based on P0, Sk and S-W respected the two considered nominal levels. However, the Wald test increased the rejection proportion with the sample size; it even overcame 50% for N = 5,000 and α = 0.05. Figure 4 shows the observed power for the four studied tests in the third described scenario. The results were not essentially different from the ones observed in Figure 2, although it is worth remarking that this scenario is more complex than the previous ones.

Confidence intervals
In order to study the quality of the estimation provided by the skewness-based methods and the usual maximum-likelihood based one, we explored the coverage percentage of symmetric 95% confidence intervals built by using the usual naive bootstrap method (200 replications) for the procedures based on the probability (P0) and on the moments (Sk). In addition, the R package frontier is used to compute the confidence interval for the maximum-likelihood method. Table 4 shows the observed results in 5,000 Monte Carlo iterations from scenario 1. Obviously, estimations have only been computed when it was possible (the percentage of samples finally used is provided as % valid). Together with the coverage percentage, the length of the 95% confidence interval is also reported. The W estimator cannot be directly used for λ² lower than 1, and if λ² = 1 the sample size must be really large. Table 5 is similar to Table 4 when samples are drawn from scenario 2. The three considered procedures obtained bad results in this setting: the Sk and W estimators achieved really poor coverage, and their results were only similar to the expected ones for λ = 2. In addition, the coverage percentage decreased with the sample size for the three investigated estimators. Although the lengths of its confidence intervals were the largest, only P0 obtained reasonable results for most of the considered situations. Table 6 shows the observed results for scenario 3. These results were not different from the ones obtained in Table 4. The usual estimator based on maximum likelihood only worked for λ² ≥ 2 and N ≥ 1,000.

WHO data analysis
The kernel density estimation for the observed residuals was computed. It seems that the residuals are not drawn from a normal law (the Shapiro-Wilk test rejects this hypothesis with a p-value of 0.001); moreover, the kernel density estimation (KDE) is far from the normal model but also far from the normal-half normal model fitted by the maximum-likelihood method.
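The symmetric naive-bootstrap interval used in the comparison above can be sketched as follows (our reconstruction; the estimator passed in could be the P0-based or the moment-based one):

```python
import numpy as np
from scipy import stats

def naive_bootstrap_ci(resid, estimator, B=200, level=0.95, rng=None):
    """Symmetric naive-bootstrap CI: point estimate +/- z * bootstrap SE.

    Resamples residuals with replacement. Replications where the
    estimator is undefined (e.g. "wrong" skewness) are dropped; the
    share of valid replications is also returned, mimicking the
    '% valid' accounting of the tables.
    """
    rng = rng or np.random.default_rng()
    resid = np.asarray(resid)
    point = estimator(resid)
    if point is None:
        return None
    reps = []
    for _ in range(B):
        est = estimator(rng.choice(resid, size=resid.size, replace=True))
        if est is not None and np.isfinite(est):
            reps.append(est)
    if len(reps) < 2:
        return None
    z = stats.norm.ppf(0.5 + level / 2.0)
    se = np.std(reps, ddof=1)
    return point - z * se, point + z * se, len(reps) / B

# Example with a simple skewness statistic (illustrative estimator of ours).
def skew_stat(r):
    m2 = np.mean((r - r.mean()) ** 2)
    m3 = np.mean((r - r.mean()) ** 3)
    return m3 / m2 ** 1.5

rng = np.random.default_rng(3)
eps = rng.normal(size=5000) - np.abs(rng.normal(0.0, 1.5, size=5000))
ci = naive_bootstrap_ci(eps - eps.mean(), skew_stat, rng=np.random.default_rng(4))
```

Returning the share of valid replications alongside the interval makes the selection effect visible: when many bootstrap replications show the "wrong" skewness, the reported interval is computed on a non-random subset.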

Discussion
Stochastic frontier models are conventionally used in order to study production and cost functions. The specialized literature focuses, mainly, on the estimation of the parameters of these functions by regression techniques. Parametric normal-half normal residuals jointly with the maximum-likelihood method is, probably, the most frequently employed procedure. In the voluminous literature on this topic, statistical testing of the properties of the disturbance term has attracted little attention. In this work, the authors deal with the stochastic frontier model identification from a statistical approach. Via Monte Carlo simulations, we study the performance of the Wald test computed from the R package frontier. The results were catastrophic; the Wald test respects neither the nominal levels nor the coverage percentages in most of the considered situations. Moreover, in the presence of outliers, the obtained results were extremely poor. These results support previous works which reached similar conclusions [11,14].
Similarly to Kuosmanen & Fosgerau [15], we proposed to study the expected skewness of the residuals in order to make inference about the capacity of the data to establish a stochastic frontier model. Simar and Wilson [16] and Feng et al. [18] worked on the so-called "wrong" skewness, and it seems clear that, even when the stochastic frontier model is correctly specified, the data can present "wrong" skewness. We assume that, in these cases, the data have not provided enough evidence to estimate the potential SFA.
The newly proposed estimator, based on the empirical estimation of P{ε ≤ 0}, is really easy to compute, fully non-parametric and very robust, which implies being a bit conservative when there are no outliers in the sample. The analysis of the considered real dataset was really enlightening: while the moment-based estimator provided an estimation really close to zero, the estimator based on P{ε ≤ 0} obtained a λ estimation close to 1 (a poor one, taking into account the available sample size and the power observed in the Monte Carlo study), and the Wald method provided a λ value larger than 5. The observation of the residuals distribution suggests a lack of normality, but the adjustment to the normal-half normal model proposed by the maximum-likelihood method was even worse. It seems that the maximum-likelihood method assigns all deviation from normality to the so-called inefficiency; however, as is well known, a regression model can have different problems which produce a lack of normality in the residuals distribution.
In short, stochastic frontier models frequently try to detect a slight deviation from normality in the residuals; this deviation is assigned to the inefficiency term. However, detecting these deviations is really difficult without huge samples. In addition, the origin of these deviations need not be the inefficiency term; in particular, there exist different causes which can provoke this effect. A deep study of the distribution of the residuals, particularly of their symmetry, can lead to a good knowledge of the real nature of the model. Of course, the application of (relatively) modern techniques to analyze the symmetry of the perturbation distributions, or methods based on quantiles, among many others, can be useful [19,20].