Parameters Estimation of Weibull Distribution Based on Fuzzy Data Using Neural Network

This article discusses a procedure for estimating the parameters of a two-parameter Weibull distribution. The data are considered imprecise and are in the form of fuzzy numbers. An artificial neural network is used to estimate the parameters of the Weibull distribution. The network architecture is determined experimentally based on the root mean squared error (RMSE). Classical methods of parameter estimation, such as the method of moments, maximum likelihood estimation and Bayesian estimation, are also discussed. The performance of each method is compared using the mean and standard deviation of the estimates based on simulated data for a range of parameter values.


Introduction
Weibull distributions have several desirable properties with natural physical interpretations. The Weibull distribution generalizes the exponential and Rayleigh distributions. W. Weibull first introduced it in 1937 for reliability estimation and life testing of machinery. Both three-parameter and two-parameter forms appear in the literature. A random variable $X$ following the two-parameter Weibull distribution with parameters $(\alpha, \beta)$ has the probability density function (pdf) and cumulative distribution function (cdf) given in Eq. (1), where $\alpha, \beta > 0$ are the scale and shape parameters. The Weibull distribution is widely used in reliability modeling and life testing. The estimation of its parameters has been discussed by Qiao & Tsokos [1]. Graphical approaches are often used for their simplicity and speed. Balakrishnan & Kateri [2] proposed a simple and easily applicable graphical approach for parameter estimation of the Weibull distribution, which readily shows the existence of the maximum likelihood estimates (MLE). However, graphical approaches are subject to measurement error.
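As an illustration of the pdf and cdf referred to above, the following is a minimal sketch that assumes the common scale-shape parameterization, $f(x) = (\beta/\alpha)(x/\alpha)^{\beta-1}\exp\{-(x/\alpha)^{\beta}\}$ and $F(x) = 1 - \exp\{-(x/\alpha)^{\beta}\}$ for $x > 0$. This form is an assumption made for illustration; the function names are hypothetical.

```python
import math

def weibull_pdf(x, alpha, beta):
    """Weibull density. ASSUMED form: alpha = scale, beta = shape."""
    if x <= 0:
        return 0.0
    z = x / alpha
    return (beta / alpha) * z ** (beta - 1) * math.exp(-(z ** beta))

def weibull_cdf(x, alpha, beta):
    """Weibull distribution function under the same assumed parameterization."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / alpha) ** beta))

# beta = 1 gives the exponential distribution, beta = 2 the Rayleigh form.
exponential_density = weibull_pdf(1.0, 2.0, 1.0)
rayleigh_density = weibull_pdf(1.0, 2.0, 2.0)
```

Under this form, setting $\beta = 1$ or $\beta = 2$ recovers the exponential and Rayleigh special cases mentioned in the text.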
Type-I and Type-II censoring provide another setting in which parameter estimation of the Weibull distribution has been discussed in several articles. Censored data usually arise when an event such as a disease or a failure is only partially observed because information is collected at certain examination times. Watkins [3] presented a maximum likelihood estimation (MLE) approach for the Weibull distribution when the data contain both times to failure and censored times. Marks [4] introduced an effective iterative procedure for the estimation. Banerjee & Kundu [5] discussed a hybrid censoring scheme that combines Type-I and Type-II censoring, and presented an approach for obtaining estimates of the unknown parameters iteratively. Zaindin & Ammar [6] considered the modified Weibull distribution based on Type-II censored data and estimated its parameters by maximum likelihood and traditional least squares techniques.
When explicit forms of the estimates cannot be obtained by directly solving the likelihood equations, an iterative procedure is needed. Among iterative procedures, the expectation maximization (EM) algorithm is very often used for parameter estimation of the Weibull distribution. Another commonly used numerical method is the Newton-Raphson (NR) method. Panahi & Saeid [7] discussed estimation of the Weibull distribution based on Type-II censored samples. When prior knowledge about the distribution of the parameters is available, a Bayesian estimation procedure can be used. Nandi & Dewan [8] considered estimation of the parameters of the Marshall-Olkin bivariate Weibull distribution in the presence of random censoring. They used the EM algorithm because the maximum likelihood estimators of the parameters cannot be expressed in closed form. Balakrishnan & Mitra [9] considered truncated and right-censored data and applied the EM algorithm and the NR method to estimate the model parameters. However, these iterative methods have shortcomings in terms of stopping criteria, time complexity and a tendency to converge to local optima.
In all the above inferential methods, the data were assumed to be precise. In real-life situations, however, it is sometimes difficult to measure and record precise data. On many occasions precise data cannot be recorded because of unexpected errors, assignable causes or machine errors. Pak et al. [10] considered the problem of estimating the parameters of the Weibull distribution when the collected data are imprecise and are represented as fuzzy numbers. Maximum likelihood estimates (MLEs) of the parameters were obtained by the Newton-Raphson (NR) and Expectation Maximization (EM) algorithms, and Bayes estimates of the unknown parameters were obtained under Gamma priors. Gertner & Zhu [11] studied Bayesian estimators based on two kinds of extension of the fuzzy likelihood function for a given fuzzy prior. Thierry [12] presented a method for estimating the parameters of a parametric statistical model when the data are fuzzy and are assumed to be related to underlying crisp realizations of a random sample.
Approaches based on artificial neural networks also have a place among iterative procedures for parameter estimation. Abbasi et al. [13] proposed an artificial neural network based approach for estimating the parameters of the Burr XII distribution. Likas [14] proposed a neural network approach to density estimation. Parameter estimation of the K-distribution based on the method of moments, maximum likelihood estimation (MLE) and neural networks has been discussed by Iskander et al. [15] and Wachowiak et al. [16]. We consider a problem that differs from censoring and truncation: the data for the Weibull random variable are fuzzy in nature, and we develop artificial neural network based approaches to estimate the parameters. We also compare the results of the neural network approach with those of the NR method, the EM algorithm and the Bayesian approach.

Fuzzy sets and probability
Fuzzy set theory describes a calculus for the uncertainty associated with classification, or what is called "imprecision".
However, both uncertainty and imprecision can be present in the same problem. Zadeh [17] and Singpurwalla & Booker [18] stated that "Probability must be used in concert with fuzzy logic to enhance its effectiveness. In this perspective, probability theory and fuzzy logic are complementary rather than competitive". Let $(R^n, \mathcal{A}, P)$ denote a probability space, where $\mathcal{A}$ is the σ-field of Borel sets in $R^n$ and $P$ is a probability measure over $R^n$ satisfying $0 \le P \le 1$. Then the probability of a fuzzy event $\tilde{A}$ in $R^n$, with Borel measurable membership function $\mu_{\tilde{A}}$, is defined by:

$$P(\tilde{A}) = \int_{R^n} \mu_{\tilde{A}}(x)\, dP(x)$$
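The definition of the probability of a fuzzy event can be checked numerically. The sketch below is illustrative only: it assumes a Weibull density in the scale-shape form (an assumption, not confirmed by the text), a hypothetical trapezoidal membership function, and a simple midpoint rule for the integral of the membership function against the density.

```python
import math

def weibull_pdf(x, alpha, beta):
    """ASSUMED scale-shape Weibull density (alpha = scale, beta = shape)."""
    if x <= 0:
        return 0.0
    z = x / alpha
    return (beta / alpha) * z ** (beta - 1) * math.exp(-(z ** beta))

def trapezoid_membership(x, a, b, c, d):
    """Membership grade of x in the trapezoidal fuzzy number (a, b, c, d)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def fuzzy_event_probability(membership, alpha, beta, lo, hi, steps=2000):
    """Integral of membership(x) * f(x) over [lo, hi] by the midpoint rule."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += membership(x) * weibull_pdf(x, alpha, beta)
    return total * h

# A crisp interval is the special case of a 0/1 membership function.
crisp = lambda x: 1.0 if 1.0 <= x <= 3.0 else 0.0
about_two = lambda x: trapezoid_membership(x, 1.0, 1.5, 2.5, 3.0)

p_crisp = fuzzy_event_probability(crisp, 2.0, 1.5, 1.0, 3.0)
p_fuzzy = fuzzy_event_probability(about_two, 2.0, 1.5, 1.0, 3.0)
```

With an indicator membership function the integral reduces to the ordinary probability of the interval, which gives a quick sanity check on the numerical integration.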

Fuzzy data and likelihood function
Suppose that $x$ is a realization of a random vector $X$. Now consider that $x$ is observed imprecisely, so that only a partial observation of $x$ is available in the form of a fuzzy subset $\tilde{x}$ with Borel measurable membership function $\mu_{\tilde{x}}(x)$. The fuzzy observation can be viewed as arising in two steps: (1) a realization $x$ is drawn from $X$; (2) the observer records partial information about $x$ in the form of a possibility distribution $\mu_{\tilde{x}}(x)$.
Note that, in this approach, step (1) is a random experiment, whereas step (2) consists of gathering knowledge about $x$ and modeling this partial information as a possibility distribution.

Likelihood and log likelihood function
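Suppose the occurrence of each event is determined by two intervals, so that each observation may be represented as a trapezoidal fuzzy number $\tilde{x}$. Once $\tilde{x}$ is given, and assuming its membership function $\mu_{\tilde{x}}$ to be Borel measurable, its probability can be computed according to Zadeh's definition of the probability of a fuzzy event. Using Eq. (3), the observed-data likelihood function can be written as

$$L(\alpha, \beta; \tilde{x}) = P(\tilde{x}; \alpha, \beta) = \int f(x; \alpha, \beta)\, \mu_{\tilde{x}}(x)\, dx.$$

Since the data vector $x$ is a realization of an independent and identically distributed random vector $X$, and assuming the decomposable joint membership function of Eq. (6), $\mu_{\tilde{x}}(x) = \prod_{i=1}^{n} \mu_{\tilde{x}_i}(x_i)$, the likelihood function becomes

$$L(\alpha, \beta; \tilde{x}) = \prod_{i=1}^{n} \int f(x; \alpha, \beta)\, \mu_{\tilde{x}_i}(x)\, dx$$

and the corresponding log-likelihood is

$$\ell(\alpha, \beta; \tilde{x}) = \sum_{i=1}^{n} \log \int f(x; \alpha, \beta)\, \mu_{\tilde{x}_i}(x)\, dx. \qquad (8)$$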
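The log-likelihood for fuzzy observations can be evaluated numerically as a sum of log-integrals, one per observation. The following is a minimal sketch under stated assumptions: the scale-shape Weibull density, trapezoidal fuzzy numbers given as tuples (a, b, c, d), and midpoint-rule integration over each support. All names and the example data are hypothetical.

```python
import math

def weibull_pdf(x, alpha, beta):
    """ASSUMED scale-shape Weibull density (alpha = scale, beta = shape)."""
    if x <= 0:
        return 0.0
    z = x / alpha
    return (beta / alpha) * z ** (beta - 1) * math.exp(-(z ** beta))

def trapezoid_membership(x, a, b, c, d):
    """Membership grade of x in the trapezoidal fuzzy number (a, b, c, d)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def fuzzy_log_likelihood(fuzzy_data, alpha, beta, steps=400):
    """Sum over observations of log( integral of f(x) * mu_i(x) dx )."""
    total = 0.0
    for a, b, c, d in fuzzy_data:
        h = (d - a) / steps
        integral = 0.0
        for k in range(steps):
            x = a + (k + 0.5) * h  # midpoint of the k-th subinterval
            integral += trapezoid_membership(x, a, b, c, d) * weibull_pdf(x, alpha, beta)
        total += math.log(integral * h)
    return total

# Made-up fuzzy observations, for illustration only.
fuzzy_data = [(1.8, 1.9, 2.1, 2.2), (0.9, 1.0, 1.1, 1.2), (2.9, 3.0, 3.2, 3.3)]
ll_plausible = fuzzy_log_likelihood(fuzzy_data, alpha=2.0, beta=1.5)
ll_implausible = fuzzy_log_likelihood(fuzzy_data, alpha=20.0, beta=1.5)
```

As the fuzzy numbers shrink towards single points, each integral approaches the density at that point times the area under the membership function, so the fuzzy log-likelihood differs from the crisp one only by a constant.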

Parameter estimation of a distribution
The theory of parameter estimation is extensive, and many methods are available for estimating the parameters of a particular distribution. Methods in common use include the method of moments, maximum likelihood estimation and Bayesian estimation.

Method of maximum likelihood estimation (MLE)
MLE is a very frequently used method of parameter estimation in statistics and a crucial tool for many statistical modeling procedures [19]. Let $X = (X_1, X_2, \ldots, X_n)$ be a random sample drawn independently and identically distributed (iid) from a probability density function $f(x; \theta)$. Then $L(\theta) = \prod_{i=1}^{n} f(x_i; \theta)$ is the likelihood function and $\ell(\theta) = \log L(\theta) = \sum_{i=1}^{n} \log f(x_i; \theta)$ is the log-likelihood function. MLEs are obtained by maximizing the log-likelihood function with respect to the parameters of interest. In many cases, however, closed-form expressions for the estimators cannot be obtained, and iterative procedures are adopted instead. These include numerical methods such as the Newton-Raphson method and the widely used expectation maximization (EM) algorithm. Bayesian estimation is another approach, and some of these methods are discussed in the following sections.
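To make the iterative idea concrete for precise (crisp) data, the sketch below applies Newton-Raphson to the profile likelihood equation for the shape parameter. It assumes the scale-shape Weibull form, under which the scale has the closed form $\alpha^{\beta} = \frac{1}{n}\sum x_i^{\beta}$ once $\beta$ is known. The function name, starting value and simulated data are illustrative assumptions.

```python
import math
import random

def fit_weibull_mle(xs, tol=1e-10, max_iter=200):
    """Newton-Raphson on g(beta) = sum(x^b ln x)/sum(x^b) - 1/b - mean(ln x) = 0."""
    n = len(xs)
    logs = [math.log(x) for x in xs]
    mean_log = sum(logs) / n
    beta = 1.0  # starting value (assumption)
    for _ in range(max_iter):
        s0 = s1 = s2 = 0.0
        for x, lx in zip(xs, logs):
            p = x ** beta
            s0 += p
            s1 += p * lx
            s2 += p * lx * lx
        m1 = s1 / s0
        g = m1 - 1.0 / beta - mean_log
        slope = s2 / s0 - m1 * m1 + 1.0 / beta ** 2  # derivative of g, always positive
        new_beta = beta - g / slope
        if new_beta <= 0:  # safeguard: keep the shape positive
            new_beta = beta / 2.0
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    alpha = (sum(x ** beta for x in xs) / n) ** (1.0 / beta)
    return alpha, beta

# Simulated crisp sample; random.weibullvariate takes (scale, shape).
rng = random.Random(1)
sample = [rng.weibullvariate(2.0, 1.5) for _ in range(2000)]
alpha_hat, beta_hat = fit_weibull_mle(sample)
```

Only the one-dimensional equation in $\beta$ needs iteration here, which is why Newton-Raphson is a common choice for crisp Weibull data.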

MLE of parameters of Weibull distribution when nature of the data is not precise
The method of maximum likelihood maximizes the likelihood function with respect to the parameters. It is the most commonly used method of parameter estimation and is considered the most robust, yielding estimators with good statistical properties. The MLEs of the parameters $\alpha$ and $\beta$ of the pdf in Eq. (1) are obtained by partially differentiating the log-likelihood function (8) with respect to each parameter and equating the derivatives to zero, which gives the likelihood equations (9) and (10). These equations admit no closed-form solution, so an iterative approach is needed to solve for $\alpha$ and $\beta$. The following sections discuss some of the methods used to obtain the solution.

Graphical approach
In this approach, a solution is sought with the help of plots. From equation (10), $\lambda$ can be written as:

EM algorithm
The Expectation-Maximization (EM) algorithm is a broadly applicable method for obtaining maximum likelihood estimates iteratively. Each iteration of the EM algorithm consists of two steps, the Expectation step (E-step) and the Maximization step (M-step). From equation (4), the M-step requires solving equations (18) and (19).
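The two steps can be sketched for fuzzy Weibull data as follows. This is an illustrative sketch, not the paper's equations (18) and (19): it assumes the scale-shape Weibull form, trapezoidal fuzzy numbers, a fixed grid of integration nodes per observation, and bisection (rather than Newton-Raphson) for the M-step. In the E-step each node receives a weight proportional to membership times the current density; the M-step then solves a weighted version of the crisp likelihood equations.

```python
import math
import random

def weibull_pdf(x, alpha, beta):
    """ASSUMED scale-shape Weibull density; nodes are always positive here."""
    z = x / alpha
    return (beta / alpha) * z ** (beta - 1) * math.exp(-(z ** beta))

def trapezoid_membership(x, a, b, c, d):
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def em_fuzzy_weibull(fuzzy_data, alpha=1.0, beta=1.0, grid=15, em_iters=20):
    n = len(fuzzy_data)
    nodes = []  # per observation: (x, ln x, membership) at midpoint nodes
    for a, b, c, d in fuzzy_data:
        h = (d - a) / grid
        xs = [a + (k + 0.5) * h for k in range(grid)]
        nodes.append([(x, math.log(x), trapezoid_membership(x, a, b, c, d)) for x in xs])
    for _ in range(em_iters):
        # E-step: conditional weights of each node given the fuzzy observation.
        pts = []
        for obs in nodes:
            w = [mu * weibull_pdf(x, alpha, beta) for x, _, mu in obs]
            s = sum(w)
            pts.extend((x, lx, wi / s) for (x, lx, _), wi in zip(obs, w))
        mean_log = sum(w * lx for _, lx, w in pts) / n

        def g(shape):
            s0 = s1 = 0.0
            for x, lx, w in pts:
                p = w * x ** shape
                s0 += p
                s1 += p * lx
            return s1 / s0 - 1.0 / shape - mean_log

        # M-step: g is increasing in the shape, so bisection finds its root.
        lo, hi = 0.05, 20.0
        for _ in range(35):
            mid = 0.5 * (lo + hi)
            if g(mid) < 0:
                lo = mid
            else:
                hi = mid
        beta = 0.5 * (lo + hi)
        alpha = (sum(w * x ** beta for x, _, w in pts) / n) ** (1.0 / beta)
    return alpha, beta

# Simulated data with a HYPOTHETICAL multiplicative fuzzification.
rng = random.Random(7)
fuzzy_data = []
for _ in range(100):
    x = rng.weibullvariate(2.0, 1.5)
    fuzzy_data.append((0.90 * x, 0.95 * x, 1.05 * x, 1.10 * x))
alpha_hat, beta_hat = em_fuzzy_weibull(fuzzy_data)
```

Because the fuzzy numbers in this example are narrow relative to the spread of the data, the weights change little between iterations and a small number of EM iterations is typically enough.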

Bayesian estimation
While the EM algorithm has been discussed above for finding MLEs in a frequentist framework, it can equally be applied to characterize the posterior distribution. Note that equation (22) cannot be solved analytically to obtain the expectation. Tierney & Joseph's [19] approximation procedure has therefore been adopted to compute the Bayes estimates of $(\alpha, \beta)$.
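For orientation, Laplace-type approximations of posterior expectations are commonly written in the form below. This is a general sketch and is not claimed to reproduce the paper's own equations: $\theta = (\alpha, \beta)$, $\pi(\theta)$ is the prior density, $\ell(\theta)$ is the log-likelihood (8), and $g(\theta)$ is a positive function whose posterior expectation is required. The symbols $\delta$, $\delta^{*}$, $\Sigma$ and $\Sigma^{*}$ are notation introduced here for illustration.

```latex
\delta(\theta) = \frac{1}{n}\left[\log \pi(\theta) + \ell(\theta)\right],
\qquad
\delta^{*}(\theta) = \delta(\theta) + \frac{1}{n}\log g(\theta),

E\left[g(\theta) \mid \tilde{x}\right] \approx
\left(\frac{|\Sigma^{*}|}{|\Sigma|}\right)^{1/2}
\exp\left\{ n\left[\delta^{*}(\hat{\theta}^{*}) - \delta(\hat{\theta})\right] \right\}
```

Here $\hat{\theta}$ and $\hat{\theta}^{*}$ maximize $\delta$ and $\delta^{*}$, and $\Sigma$ and $\Sigma^{*}$ are the inverses of the negative Hessians of $\delta$ and $\delta^{*}$ at those maximizers. Taking $g(\theta) = \alpha$ or $g(\theta) = \beta$ gives approximate posterior means under squared error loss.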

Neural network
A neural network is trained iteratively, adjusting its weights so that the output approaches the target by minimizing an error measure. The literature on neural networks is vast, and they have been used for parameter estimation of distributions by Wachowiak et al. [16] and Abbasi et al. [13]. In the present study, a feed-forward neural network is applied to estimate the parameters of the Weibull distribution. An important aspect of a neural network is its architecture, i.e. the number of hidden layers and the number of units in each layer. Teoh et al. [20] used singular value decomposition to study the impact of increasing the number of neurons in the hidden layer of a feed-forward neural network. The appeal of using a neural network to estimate the parameters of the Weibull distribution is that it can learn the input-output relationship without any prior knowledge of the functional relationship between input and output [21].
The $r$th order moment of the probability density function in Eq. (1) is expressed in terms of the gamma function $\Gamma(\cdot)$.
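Under the assumed scale-shape parameterization, the $r$th raw moment is $E(X^r) = \alpha^{r}\,\Gamma(1 + r/\beta)$. The sketch below evaluates it and inverts the first two moments to recover $(\alpha, \beta)$, which is the relationship a moment-based estimator relies on. The parameterization, the function names and the bisection bracket are assumptions made for illustration.

```python
import math

def weibull_moment(r, alpha, beta):
    """r-th raw moment under the ASSUMED scale-shape form."""
    return alpha ** r * math.gamma(1.0 + r / beta)

def method_of_moments(m1, m2):
    """Solve m2 / m1^2 = Gamma(1 + 2/b) / Gamma(1 + 1/b)^2 for the shape b."""
    target = m2 / (m1 * m1)

    def ratio(b):
        return math.gamma(1.0 + 2.0 / b) / math.gamma(1.0 + 1.0 / b) ** 2

    lo, hi = 0.1, 50.0  # bracket for the shape (assumption)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if ratio(mid) > target:  # the ratio decreases as the shape grows
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    alpha = m1 / math.gamma(1.0 + 1.0 / beta)
    return alpha, beta

# Round trip: exact moments of Weibull(scale 2, shape 1.5) give back (2, 1.5).
m1 = weibull_moment(1, 2.0, 1.5)
m2 = weibull_moment(2, 2.0, 1.5)
alpha_mm, beta_mm = method_of_moments(m1, m2)
```

The ratio $m_2 / m_1^2$ does not depend on the scale, so the shape can be found from a one-dimensional search and the scale then follows from the first moment.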
From the weak law of large numbers it is known that $\frac{1}{n}\sum_{i=1}^{n} X_i^{r} \xrightarrow{P} E(X^{r})$, where $P$ denotes convergence in probability. Thus the sample moments are consistent and unbiased estimators of the population moments. Equating the sample moments to the corresponding population moments gives Eq. (26), which shows that $\alpha$ and $\beta$ are related to the first- and second-order moments; hence the sample moments are used as inputs to train the network.
It is important to decide on the network architecture, that is, the number of hidden layers and the number of nodes in each hidden layer. The appropriate architecture depends on the complexity of the problem at hand, and a suitable network for a given input-output data set is determined experimentally based on the RMSE. Moreover, the objective of training a neural network is not to learn an exact representation of the training data but to build a model that generalizes to new inputs, usually called test data. In practice, if a feed-forward neural network is over-fitted to the noise in the training data, it memorizes the training data and generalizes poorly to a test data set. The network architecture used in the present study is therefore not fixed; different network structures were used for training depending on the sample at hand. MATLAB (MathWorks) and R 3.0.3 were used for the analysis. The generated data are randomly divided into training, testing and validation sets in the proportions 50%, 30% and 20%, respectively. Over-fitting is avoided by analyzing the neural network performance after training [22,23].
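The training setup described above can be sketched with a very small feed-forward network written in plain Python. This is an illustration under stated assumptions, not the architecture or software used in the study: crisp simulated samples, one tanh hidden layer with 6 units, per-sample backpropagation, parameters drawn uniformly from [1, 3], and a 50/30/20 split into training, testing and validation sets. All names and settings are hypothetical.

```python
import math
import random

def sample_moments(alpha, beta, n, rng):
    """First and second raw sample moments of a simulated Weibull sample."""
    xs = [rng.weibullvariate(alpha, beta) for _ in range(n)]  # (scale, shape)
    return [sum(xs) / n, sum(x * x for x in xs) / n]

class TinyMLP:
    """One tanh hidden layer, linear outputs, trained by backpropagation."""
    def __init__(self, n_in, n_hidden, n_out, rng):
        self.w1 = [[rng.gauss(0, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.gauss(0, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(self.w2, self.b2)]
        return h, y

    def train_step(self, x, target, lr):
        h, y = self.forward(x)
        err = [yk - tk for yk, tk in zip(y, target)]  # gradient of 0.5 * squared error
        grad_h = [sum(err[k] * self.w2[k][j] for k in range(len(err))) * (1.0 - h[j] ** 2)
                  for j in range(len(h))]
        for k, e in enumerate(err):
            for j, hj in enumerate(h):
                self.w2[k][j] -= lr * e * hj
            self.b2[k] -= lr * e
        for j, gj in enumerate(grad_h):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * gj * xi
            self.b1[j] -= lr * gj

def rmse(net, data):
    sq, count = 0.0, 0
    for x, target in data:
        _, y = net.forward(x)
        sq += sum((yk - tk) ** 2 for yk, tk in zip(y, target))
        count += len(target)
    return math.sqrt(sq / count)

rng = random.Random(0)
data = []
for _ in range(400):
    alpha, beta = rng.uniform(1.0, 3.0), rng.uniform(1.0, 3.0)
    data.append((sample_moments(alpha, beta, 50, rng), [alpha, beta]))

train, test, valid = data[:200], data[200:320], data[320:]  # 50% / 30% / 20%
mu = [sum(x[i] for x, _ in train) / len(train) for i in range(2)]
sd = [math.sqrt(sum((x[i] - mu[i]) ** 2 for x, _ in train) / len(train)) for i in range(2)]

def scale(dataset):
    """Standardize the inputs using training-set statistics only."""
    return [([(v - m) / s for v, m, s in zip(x, mu, sd)], t) for x, t in dataset]

train, test, valid = scale(train), scale(test), scale(valid)
net = TinyMLP(2, 6, 2, rng)
rmse_before = rmse(net, valid)
for epoch in range(150):
    rng.shuffle(train)
    for x, t in train:
        net.train_step(x, t, lr=0.01)
rmse_valid, rmse_test = rmse(net, valid), rmse(net, test)
```

Comparing the RMSE on the validation and test sets with the training error is one simple way to spot over-fitting. In an architecture search, the hidden-layer size would be varied and the configuration with the lowest validation RMSE retained.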

Data generation and analysis
Simulation allows different analytical techniques to be compared. Monte Carlo simulation was used to generate data of different sample sizes for a Weibull random variable, and the data were then fuzzified using the following membership functions.
Samples of different sizes were generated for the purpose of comparison. For each sample, all the methods were applied to compute the estimates, and the entire process was replicated a large number of times (1000 replications) to control for sampling variation.
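The data generation step can be sketched as follows. The fuzzification rule shown here is a hypothetical stand-in, since it is only an assumption that each simulated value is wrapped in a trapezoidal fuzzy number whose support is a fixed percentage around it. The function names and spread values are likewise illustrative.

```python
import random

def fuzzify(x, inner=0.05, outer=0.10):
    """HYPOTHETICAL rule: trapezoidal fuzzy number (a, b, c, d) around x > 0."""
    return (x * (1 - outer), x * (1 - inner), x * (1 + inner), x * (1 + outer))

def simulate_fuzzy_sample(n, alpha, beta, rng):
    """Draw n Weibull values (alpha = scale, beta = shape) and fuzzify each."""
    return [fuzzify(rng.weibullvariate(alpha, beta)) for _ in range(n)]

rng = random.Random(3)
fuzzy_sample = simulate_fuzzy_sample(25, 2.0, 1.5, rng)
```

A multiplicative spread keeps the support of every fuzzy number positive, which matters because the Weibull density is defined only for positive values.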

Results and Discussion
The methods discussed above are compared using the averages and mean squared errors (MSE) of the estimates. The average values and MSEs of the estimates over 1000 replications are presented in Table 1 and Table 2. The estimates of the parameters $\alpha$ and $\beta$ are computed from fuzzy data. The estimation methods include maximum likelihood (via the NR and EM algorithms), an artificial neural network (ANN) and a Bayesian procedure. Computing the Bayes estimates requires information about the prior distribution of the parameters. It is assumed that $\alpha$ and $\beta$ follow Gamma(a, b) and Gamma(c, d) priors, respectively. To carry out the comparison, the priors are further assumed to be non-informative, with a = b = c = d = 0. The performance of the ANN depends on the network structure used for the analysis. Since different sample sizes have been used in this study, we did not limit ourselves to a particular network architecture. The network is a feed-forward backpropagation neural network, and the number of hidden units is determined following Teoh et al. (2006) [20]. The inputs to the network were the first- and second-order moments of the generated samples; a randomly selected 20% of the samples were used for testing.
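The two summary measures used in the comparison can be computed from the replicated estimates as in the sketch below. The function name and the toy numbers are illustrative only and are not results from the study.

```python
def summarize(estimates, true_value):
    """Average and mean squared error of replicated estimates of one parameter."""
    n = len(estimates)
    average = sum(estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return average, mse

# Toy example: three replicated estimates of a parameter whose true value is 2.
average, mse = summarize([1.0, 2.0, 3.0], true_value=2.0)
```

The MSE combines the squared bias and the variance of an estimator, so it can rank methods even when one is slightly biased but much less variable.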

Conclusion
Some work on parameter estimation of the Weibull distribution based on complete and censored data has been done previously. Generally, however, it is assumed that the data at hand are exact numbers. In practice, some of the collected data may be imprecise and are represented in the form of fuzzy numbers, so a suitable methodology is needed to handle such data as well. This paper shows that the traditional methods of parameter estimation, based on the method of moments or the maximum likelihood function, do not yield explicit solutions when the data are fuzzy. Instead, iterative methods have to be used to estimate the parameters, and these involve complex and tedious computation. The proposed neural network approach to estimating the parameters of the Weibull distribution showed a significant improvement: except for small samples, the neural network performed better than the other methods. The main advantage of the neural network is that it requires neither complex theoretical derivations nor tedious computations.

Ethics
This is not applicable, since the submitted paper does not involve field work and is based only on mathematical assumptions in the field of inferential statistics.

Data accessibility
This manuscript is in the field of statistics and compares the proposed approach with existing schemes given in the references, using simulated data. Thus, there is no need for data accessibility.