Review of Fraud Detection and Churn Behavior Modeling Techniques
Ernest O Nonum1*, Chukwuedozie N Ezema2 and Inyiama C Hyacinth2
1Department of Engineering Novena University, Nigeria
2Department of Electronic and Computer Engineering, Nnamdi Azikiwe University, Nigeria
Submission: April 05, 2018; Published: June 26, 2018
*Corresponding author: Ernest O Nonum, Department of Engineering Novena University, Novena University Ogume Delta State, Nigeria, Email: email@example.com
How to cite this article: Ernest O N, Chukwuedozie N E, Inyiama C H. Review of Fraud Detection and Churn Behavior Modeling Techniques. Eng Technol Open Acc. 2018; 2(1): 555577. 10.19080/ETOAJ.2018.02.555577
This research pursues inductive/deductive approach by studying existing techniques for fraud detection and customer churn prediction. Telecommunication operators for instance store large amounts of data related with the activity of their clients. In these records exists both normal and fraudulent activity records. It is expected that the fraudulent activity records should be substantially smaller than the normal activity. If it were the other way around this type of business would be impractical due to the amount of revenue lost. Furthermore, customer churn is the focal concern of most companies which are active in industries with low switching cost. Among all industries which suffer from this issue, telecommunications industry can be considered in the top of the list with approximate annual churn rate of 30%. This means wasting the money and efforts, “it is like adding water to a leaking bucket”. In this paper, the authors present the review of past work that has been carried out by various researchers based on fraud detection and churn behavior modeling.
Keywords:Fraud Detection; Churn Behavior Modeling; Customer churn
Abbrevations: DT: Decision Trees; LR: Logistic Regression; NN: Neural Networks; NBC: Naive Bayesian Classifiers; SVM: Support Vector Machine; LDA: Linear Discriminant Analysis; SOM: Self-Organizing Maps
Since the beginning of commercial telecommunications, the fraudsters have been causing financial damage to the companies who offered these services. At the start, the carriers didn’t have the dimension, or the users, that they have nowadays and so the amount of fraud cases wasn’t as big. This could mean that the financial damage caused, wasn’t as high as it is today, but this isn’t actually true. Indeed the amount of frauds was smaller but, the cost associated with a fraud attack in the beginning infrastructure was further big that it is nowadays. Later on and due to the technological advances, the cost of a fraud attack to the carrier has been decreasing, but on the opposite direction, the amounts of occurrences have been increasing creating a constant financial damage .
Telecommunications is wide area because it is composed by a variety of services like internet, telephones, VOIP etc., Fraud in this area is consequently an extensive subject. There are types of frauds that are characteristic of one service, like the SIM cloning fraud which is particular to the mobile phone service and there are frauds that cover a bunch of services like the subscription fraud, which is associated to the subscription of services.
Customer churn on the hand creates a huge anxiety in highly competitive service sectors especially the Telecommunications sector . The churn prediction of the mobile Telecommunication industry is on the average of 2.2% according to marketing researchers .
A number of researchers have used varied techniques to address churn in various fields. In the churn analysis and modelling of the telecommunication industry, very common algorithms deployed include Decision Trees (DT), Logistic Regression (LR), Neural Networks (NN), Naïve Bayesian Classifiers (NBC), Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), Self-Organising Maps (SOM), among others.
Review of Previous Research Work
During the last few years, there has been a lot of research in the field of fraud detection and churn prediction. Sevda & Mohammad  in their paper studied data mining techniques for fraud detection. They posited that scientific data mining and business intelligence technology is valuable and somewhat hidden to provide large volumes of data. The research study used service analyzer software to analyze the annual transactions related to call records of 20000 telecom subscribers. The main data mining techniques used for fraud detection (FFD) in the study was logistic models, neural networks and decision trees, all of which provide primary solutions to the problems inherent in the detection and classification of fraudulent data. The study proposed method was clustering clients based on client type. An appropriate rule for each cluster was determined the study by the behavior of group members in case of deviation from specified behavior will be known among suspected cases. The study data were based on the type of client clustering, so each cluster representing a certain type of client, the procedure will have a different behavior. The study employed decision tree algorithm and neural network model. Models are able to extract about a lot of the rules related to client behavior. Each node in the graph model was built by selecting the corresponding table; chance percent of suspected cases have been identified.
Bhattacharyya &Pendharkar found in their study that customer retention is far more economical than customer acquisition. In their work, Artificial Neural networks (ANNs) gave the best results as compared to other conventional algorithms. Furthermore, they argued that a good prediction model has to be constantly updated and should use a combination of different data mining techniques.
In another case study , churn prediction and fraud detection was done using regression models, where each model comprised of different sets of variables and coefficients. A total of 6 regression models were used over a specific time period. Two models each of churner to non churner ratio of 1:1 and 2:3 for three different analysis periods of 4, 6 and 8 months were used. The regression model with re-sampled churner to non churner ratio of 2:3, based on data over 8 months got the best results during the testing phase. In that study, the authors concluded that due to the dynamic nature of the customer, the logistic regression models had to be updated frequently in order to achieve higher accuracy.
Bhattacharya  performed churn prediction and fraud detection using ANNs and decision trees. They found that the decision trees surpassed the former in terms of accuracy. They divided their project into five phases: Data Acquisition, Data preparation, Derived variables, Extracting Variables, and Model construction.
Gerpott et al.  studied the German mobile market based on a sample consisting of 684 residential mobile users. They were mainly concerned with finding the relationship between the three constructs: Customer retention, loyalty and satisfaction. They showed that the three constructs are causally interlinked. In addition, they identified some other factors, which have significant effects on customer retention such as mobile service price, mobile service benefit perceptions and lack of number portability. The main limitation of this study is that it investigates customer behavioral intentions and attitudes and neglects the actual customer usage behaviors. In addition, this study employs traditional statistical techniques, which are now being questioned in the churn literature.
Kim & Yoon  analyzed fraud detection and customer churn in the Korean mobile market based on a survey of 973 mobile users. They conducted an empirical analysis to identify the determinants of churn. Their results indicated that churn probability is associated with the level of satisfaction. Furthermore, they identified other services attributes that might affect churn including call quality, tariff level, handsets and brand image and subscription duration. They concluded that the main cause of customer churn in the Korean mobile market is the desire to change handsets and dissatisfaction with specific service attributes, such as call quality or price level. This study shares the same limitation as Gerpott et al.  study, namely the use of the questionnaire as a data collection tool. The concerns about potential biases in the questionnaire-based study have been reported widely in the literature. In another study of the same market conducted by Ahn et al. the limitation of using questionnaires to collect customer data was overcome by analyzing actual transactions and real customers‘ data provided by a service provider. Investigated the key determinants of churn and reported service quality, customer usage and switching costs as the main determinants of churn.
Seo et al.  analyzed customer retention in the US mobile market based on a database of 31,769 customers and call log files for one of the top ten US mobile services providers. They used binary logistic regression modeling to analyze the behavioral and the demographical factors affecting customer retention. They investigated six variables: Service plan complexity, handset sophistication, length of association, connectivity quality, age and gender. As reported by Seo et al. , one of the limitations of their study is that it was conducted before the introduction of local number portability. As noted earlier, the mobile market is changing rapidly and more recent and comprehensive research is needed.
Yan et al.  overcame the limitations of survey-based studies by using Duke Teradata‘s 2003 fraud detection and churn modeling tournament data, which contains 100,000 customer records extracted from a major mobile operator in the US. They developed a model called Churn-Strategy Alignment Model ‘to evaluate churn based on 172 variables by using factor and reliability analysis. This model offers managers a new way to define customer retention strategies and helps them to understand why customers churn, rather than focusing only on who are going to churn. However, one major drawback of this model is that it fails to consider social influences.
Hwang et al.  investigated the Korean mobile market and suggested a lifetime value model considering propensity to churn, past financial contribution and customer potential value at the same time. They used decision tree, logistic regression and neural network to develop and evaluate their model, which can be used to segment customers and develop customer retention strategies based on the customer lifetime value. Choosing the most profitable customers and retaining them while lessening or terminating relationships with less profitable customers is one of the most successful customer retention strategies to improve business profit. One important contribution of Hwang‘s et al.  study is that it adds important findings to the empirical customer churn literature by considering customer lifetime value in the churn analysis. However, the study did not take into account other important factors that strongly affect customer churn, such as social influences and market characteristics.
Hung et al.  used decision tree and neural networks to analyze customer churn and fraud in Taiwan based on data including customer demography data, billing, customer service interaction and call detail records data. Although the results of study show a significant improvement (based on lift chart) from those in early studies, one major criticism of this work is that it did not take account of social influences along with other factors that may affect customer churn.
Lejeune  applied bagging and stochastic gradient boosting (two data mining algorithms) to predict customer churn in a US mobile company. In their study, they use three groups of predictors: Behavioral, company interaction and customer demographics predictors. According to the study results, using the two algorithms performs comparably better than logistic regression. Another important finding of this study is that using ensemble classifiers produces superior performance over single classification models. In ensemble models, multiple classification models are combined into one classifier by using different methods, such as majority voting. Nevertheless, this study shares the same limitations as others in that it does not consider social influences.
In another major study, Neslin et al.  investigated the performance of fraud detection and churn prediction models of different statistical/data mining tools. This study reported the results of a tournament in which researchers use the Duke Teradata‘s churn modeling data to build churn prediction models on that data. The results of this study demonstrate that logistic regression and decision tree models outperformed traditional statistical tools such as discriminant and explanatory models. The results of this study contribute to make churn prediction more accurate, but cannot provide answers for why a customer might churn.
Ling & Sheng  identified an important shortcoming of the existing prediction models, such as logistic regression and decision tree, which is that they lack transparency and comprehensibility. They suggested incorporating domain knowledge to improve the interpretability of the resulting models. Based on two telecom data sets they demonstrated how domain knowledge could be used to improve the interpretability of predictions models.
There is need for further research work on, “Enhanced Predictive Data Mining Algorithms for Fraud Detection and Churn Behavior Modeling in Telecommunication Systems” so as to bridges a gap in knowledge by introducing data mining as an advanced machine learning approach which is applicable in Customer relationship management realm. After the significant studies regarding the customer churn from both descriptive and predictive point of view were reviewed, the issue of data imbalance in churn datasets was discussed and remedies for it were extracted from the previous studies. As it was obvious, almost all predictive models that have been developed in this realm were utilized all or some of the RFM variables as their input variables for model building [13,17-19]. There is need to conduct more study that will follow antecedents’ procedure and utilize the RFM features of customer base as the input variables in clustering phase and afterwards tailor the behavioral variables proposed by Wei & Chiu  in order to build a robust predictive model.
We can safely conclude from the existing research in the field of fraud detection and customer churn prediction, that there is not a single model that could give the highest accuracy in all of the cases. Instead, the performance of every algorithm will differ according to the characteristics of the data. Further studies should be conducted to test the conventional algorithms on data set. The study could use primary data collected from customers to create a predictive churn model that assesses customer churn rate of numerous telecommunication companies in Nigeria. This is will be unlike previous studies which were conducted in western countries. Using the IBM SPSS Modeler 18 and Rapid Miner tools, the study could generate models created by algorithms like C5.0 Decision tree algorithm, the Logistic Regression algorithm or the Discriminant Analysis algorithm. A comparative evaluation could be performed to discover the optimal model with accurate, consistent and reliable results.
Naive Bayes model classifier could be developed to do the anomaly/fraud detection on these experiments. Naive Bayes is a supervised learning classification algorithm that applies the Bayes theorem with an assumption of independency between the data features.
Many related studies focus on the step of developing prediction models and model evaluation in the data mining process, but none considered the data pre-processing step using data mining technique such as IBM SPSS Modeler 18 and Rapid Miner tools. That is, using some related technique, e.g. association rules to pre-process data may be able to improve the final performance of prediction models. As data pre-processing is an important step in the data mining process to extract useful and representative features from the original data, association rules have not been used in the data reduction step in the literature.
New research work could use three algorithms (C5.0 Decision tree algorithm, Logistic Regression algorithm and Discriminant Analysis algorithm) and performed a comparative evaluation to discover the optimal model with accurate, consistent and reliable results.
There is need for further studies to examine whether association rules can be adapted in the data pre-processing stage to reduce a large amount of information to a small and more understandable data variable in order to improve the prediction performance of using C5.0 Decision tree algorithm, Logistic Regression algorithm and Discriminant Analysis algorithm as the prediction models. The combination of the data reduction and model development steps using data mining techniques could investigate, unlike previous studies, for the problem of fraud detection and customer churn prediction.
The disadvantage of existing models and algorithms reviewed is that complex interactions among variables and attributes can affect the performance of most of the models. In addition, it becomes very complicated and difficult to visualize and interpret the models when interactions among variables and attributes become complex. Other reported disadvantages of existing models in the reviewed works are their lack of robustness and their over-sensitivity to training data sets .
With the ever-increasing complexity of the mobile telecommunication market, novel and efficacious algorithms such as C5.0 Decision tree, Logistic Regression and Discriminant Analysis needs to be evaluated for the purpose of customer churn prediction and fraud detection. C5.0 Decision tree algorithm, Logistic Regression algorithm and Discriminant Analysis algorithm are able to process both numerical and categorical data.
In addition, the modeling techniques and algorithms are relatively simple and have the ability to explain the relationship between input and output variables. They are easy to understand and visualize. Moreover, the decision process can be simplified to a set of business rules as demonstrated earlier. It is a nonparametric method; therefore, no prior assumptions about the data are needed. Hence, these are potential areas for further research.
- Buckinx W, Moons E, Van Den Poel D, Wets G (2016) Customeradapted coupon targeting using feature selection. Expert Systems with Applications 26(4): 509-518
- Bhattacharyya S, Pendharkar P (1998) Inductive, evolutionary and neural techniques for discrimination: Acomparative study. Decision Sciences 29(4): 871-900.
- Groth R (2015) Data Mining: Building Competitive Advantage, Santa Clara, CA: Prentice Hall, USA.
- Carrier C, Povel O (2013) Characterizing data mining software. Intelligent Data Analysis 7: 181-192.
- Sevda S, Mohammad AB (2015) The Study of Fraud Detection in Financial and Credit Institutions with Real Data. Computer Science and Engineering 5(2): 30-36.
- Berson A, Smith S, Therling K (2015) Building data mining applications for CRM. McGraw-Hill, New York, USA
- Bhattacharya C (1998) When customers are members: customer retention in paid membership contexts. Journal of the Academy of Marketing Science 26(1): 31-44.
- Gerpott T, Rams W, Schindler A (2016) Customer retention, loyalty, and satisfaction in the German mobile cellular telecommunications market. Telecommunications Policy 25(4): 249-269.
- Kim H, Yoon C (2016) Determinants of subscriber churn and customer loyalty in the Korean mobile telephony market. Telecommunications Policy 28(9-10): 751-765.
- Seo D, Ranganathan C, Babad Y (2014) Two-level model of customer retention in the US mobile telecommunications service market. Telecommunications Policy 32(3-4): 182-196.
- Yan L, Fassino M, Baldasare P (2012) Predicting Customer Behavior via Calling Links. Proceedings of International Joint Conference on Neural Networks, pp. 2555-2560.
- Hwang H, Jung T, Suh E (2016) An LTV model and customer segmentation based on customer value: a case study on the wireless telecommunication industry. Exper Systems with Applications 26(2): 181-188.
- Hung S, Yen D, Wang H (2010) Applying data mining to telecom churn management. Expert Systems with Applications 31(3): 515-524.
- Lejeune MA (2016) Measuring the impact of data mining on churn management. Internet Research: Electronic Networking Applications and Policy 11(5): 375-387.
- Neslin S, Gupta S, Kamakura W, Lu J, Mason C (2006) Defection Detection: Measuring and understanding the predictive accuracy of customer churn models. Journal of Marketing Research XLIII: 204-211.
- Ling C, Sheng V (2014) Cost-sensitive learning and the class imbalance problem. Inc. Sammut, Encyclopedia of Machine Learning. Springer, USA.
- Wei C, Chiu I (2002) Turning telecommunications call details to churn prediction: a data mining approach. Expert Systems with Applications 23(2): 103-112.
- Coussement K, Van den Poel D (2014) Integrating the voice of customers through call center emails into a decision support system for churn prediction. Information & Management 45(3): 164-174.
- Coussement K, Van den Poel D (2009) Improving customer attrition prediction by integrating emotions from client/company interaction emails and evaluating multiple classifiers. Expert Systems with Applications 36(3 part 2): 6127-6134.
- Burez J, Van den Poel D (2009) Handling class imbalance in customer churn prediction. Expert System with Applications 36(3 part 1): 4626- 4636.
- Berry M, Linoff G (2016) Data Mining Techniques for Marketing, Sales, and Customer Relationship Management (2nd edn), Indianapolis: Wiley Publishing Inc, USA.