In this paper, ensemble models are developed to accurately forecast software reliability. Various statistical techniques (multiple linear regression and multivariate adaptive regression splines) and intelligent techniques (backpropagation trained neural network, dynamic evolving neuro–fuzzy inference system and TreeNet) constitute the ensembles presented. Three linear ensembles and one non-linear ensemble are designed and tested. Based on experiments performed on software reliability data obtained from the literature, it is observed that the non-linear ensemble outperformed all the other ensembles as well as the constituent statistical and intelligent techniques. © 2007 Elsevier Inc. All rights reserved.
1. Introduction
Software reliability is defined as the probability of failure-free software operation for a specified period of time in a specified environment (ANSI definition). Software reliability modeling has gained considerable importance in recent years. The criticality of software in many present-day applications has led to a tremendous increase in the amount of work carried out in this area. The use of intelligent neural network and hybrid techniques in place of traditional statistical techniques has shown a remarkable improvement in the prediction of software reliability in recent years. Among the intelligent and statistical techniques, it is not easy to identify the best one, since their performance varies with the data.
In this paper, an ensemble-based approach is followed in predicting software reliability. Specifically, a non-linear ensemble trained using a backpropagation neural network (BPNN) is proposed. The proposed approach takes advantage of the prediction capabilities of all the techniques on the data and appropriately assigns weights to each technique based on its performance.
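The idea of a non-linear ensemble can be illustrated with a minimal sketch. It assumes that the constituent forecasts are already available as columns of an array and that a one-hidden-layer network trained by plain gradient-descent backpropagation acts as the arbitrator. The function names, the network size, the learning rate and the data are all hypothetical choices made for illustration and are not the configuration used in this paper.

```python
import numpy as np

def train_bpnn_arbitrator(constituent_preds, targets, n_hidden=4,
                          learning_rate=0.1, n_epochs=2000, seed=0):
    """Fit a one-hidden-layer network mapping constituent forecasts to targets.

    Returns the fitted parameters and the mean squared error at each epoch.
    All hyperparameters are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(constituent_preds, dtype=float)
    y = np.asarray(targets, dtype=float).reshape(-1, 1)
    n_samples, n_inputs = X.shape
    W1 = rng.normal(scale=0.5, size=(n_inputs, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.5, size=(n_hidden, 1))
    b2 = np.zeros(1)
    history = []
    for _ in range(n_epochs):
        # Forward pass: sigmoid hidden layer followed by a linear output.
        hidden = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))
        output = hidden @ W2 + b2
        error = output - y
        history.append(float(np.mean(error ** 2)))
        # Backward pass: gradients of the mean squared error.
        grad_output = 2.0 * error / n_samples
        grad_W2 = hidden.T @ grad_output
        grad_b2 = grad_output.sum(axis=0)
        grad_hidden = (grad_output @ W2.T) * hidden * (1.0 - hidden)
        grad_W1 = X.T @ grad_hidden
        grad_b1 = grad_hidden.sum(axis=0)
        # Gradient-descent update of all weights and biases.
        W1 -= learning_rate * grad_W1
        b1 -= learning_rate * grad_b1
        W2 -= learning_rate * grad_W2
        b2 -= learning_rate * grad_b2
    return (W1, b1, W2, b2), history

def predict_bpnn_arbitrator(params, constituent_preds):
    """Combine constituent forecasts with the trained arbitrator network."""
    W1, b1, W2, b2 = params
    X = np.asarray(constituent_preds, dtype=float)
    hidden = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))
    return (hidden @ W2 + b2).ravel()

# Hypothetical data: a normalized target and three imperfect "constituents".
truth = np.linspace(0.1, 0.9, 20)
constituents = np.column_stack([truth + 0.05, 0.9 * truth, truth ** 2])
params, history = train_bpnn_arbitrator(constituents, truth)
ensemble_forecast = predict_bpnn_arbitrator(params, constituents)
```

Because the arbitrator is itself trained on the constituent outputs, the effective weight given to each technique is learned from the data rather than fixed in advance.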
The rest of the paper is organized as follows. Section 2 presents a brief review of research on software reliability prediction. Section 3 briefly describes the stand-alone intelligent methods applied in this paper. Section 4 presents the four ensembles that are developed. Section 5 presents the experimental methodology, and Section 6 discusses the results. Section 7 presents the application of this approach to accurately modeling operational risk in banks. Finally, Section 8 concludes the paper.
2. Literature survey
In the last few years, many research studies have been carried out in the area of software reliability modeling and forecasting. They include the application of neural networks, fuzzy logic models, genetic algorithm (GA) based neural networks, recurrent neural networks, Bayesian neural networks, and support vector machine (SVM) based techniques, to name a few. Cai et al. (1991) advocated the development of fuzzy software reliability models in place of probabilistic software reliability models (PSRMs). Their argument was based on the proof that software reliability is fuzzy in nature. A demonstration of how to develop a fuzzy model to characterize software reliability was also presented. Karunanithi et al. (1992) carried out a detailed study to explain the use of connectionist models in software reliability growth prediction. It was shown through empirical results that the connectionist models adapt well across different datasets and exhibit better predictive accuracy than the well-known analytical software reliability growth models. Sitte (1999) made a comparative study of neural networks and parametric-recalibration models in software reliability prediction and found neural networks to be much simpler to use and also to be better predictors. The empirical results also showed that the neural network models are better trend predictors. Ho et al. (2003) performed a comprehensive study of connectionist models and their applicability to software reliability prediction and found them to be better and more flexible than the traditional models. A comparative study was performed between their proposed modified Elman recurrent neural network and the more popular feedforward neural network, the Jordan recurrent model, and some traditional software reliability growth models. Numerical results show that the proposed network architecture performed better than the other models in terms of predictions. Despite the recent advancements in software reliability growth models, it was observed that different models have different predictive capabilities and that no single model is suitable under all circumstances.
Tian and Noore (2005a) proposed an on-line adaptive software reliability prediction model using an evolutionary connectionist approach based on a multiple-delayed-input single-output architecture. The proposed approach, as shown by their results, had better next-step predictability than existing neural network models for failure time prediction. Tian and Noore (2005b) proposed an evolutionary neural network modeling approach for software cumulative failure time prediction. Their results were found to be better than those of the existing neural network models. It was also shown that the neural network architecture has a great impact on the performance of the network. According to Bai et al. (2005), Bayesian networks show a strong ability to adapt in problems involving complex variant factors. They developed a software prediction model based on Markov Bayesian networks, and a method to solve the network model was proposed. Reformat (2005) proposed an approach leading to multitechnique knowledge extraction and the development of a comprehensive meta-model prediction system in the area of corrective maintenance of software. The system was based on evidence theory and a number of fuzzy-based models. In addition, a detailed case study was carried out for estimating the number of defects in a medical imaging system using the proposed approach. Pai and Hong (2006) applied support vector machines (SVMs) for forecasting software reliability, where a simulated annealing (SA) algorithm was used to select the parameters of the SVM model. The experimental results show that the proposed model gave better predictions than the other compared methods. Su and Huang (2006) showed how to apply neural networks to predict software reliability. Further, they made use of the neural network approach to build a dynamic weighted combinational model (DWCM), and experimental results show that the proposed model gave significantly better predictions.
More recently, neural networks were applied to predicting faults in object-oriented software (Kanmani et al., 2007). The study showed that neural network models perform much better than the statistical methods.
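Several of the studies surveyed above feed a model with delayed values of the failure series in order to predict the next value, as in the multiple-delayed-input single-output architecture. The sketch below shows one common way to build such input and target pairs from a one-dimensional series. The function name, the number of lags and the sample values are hypothetical and are not taken from any of the cited works.

```python
import numpy as np

def make_lagged_dataset(series, n_lags):
    """Turn a 1-D series into (inputs, targets) for one-step-ahead prediction.

    Row i of the inputs holds series[i : i + n_lags];
    the matching target is series[i + n_lags].
    """
    values = np.asarray(series, dtype=float)
    if n_lags < 1 or n_lags >= len(values):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    n_rows = len(values) - n_lags
    inputs = np.stack([values[i:i + n_lags] for i in range(n_rows)])
    targets = values[n_lags:]
    return inputs, targets

# Hypothetical cumulative failure times, for illustration only.
failure_times = [3.0, 7.0, 12.0, 20.0, 31.0, 45.0]
X, y = make_lagged_dataset(failure_times, n_lags=3)
```

Each row of `X` can then be passed to any of the regression techniques as the input vector, with the corresponding entry of `y` as the value to be predicted.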
The application of intelligent techniques in place of statistical techniques has increased rapidly in recent years. The application of soft computing techniques in software reliability engineering has emerged recently (Madsen et al., 2006). Despite the recent advancements in software reliability growth models, it has been observed that different models have different predictive capabilities and that no single model is suitable under all circumstances. An ensemble takes the outputs of its individual constituents as inputs and processes them according to the design of the arbitrator at the heart of the ensemble.
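As a minimal sketch of the arbitrator idea, the following code combines constituent outputs in two simple linear ways: a plain average, and a weighted mean whose weights are inversely proportional to each constituent's validation error. Both rules, the function names and the numbers are illustrative assumptions and are not necessarily the linear ensembles designed in this paper.

```python
import numpy as np

def average_ensemble(predictions):
    """Simple-average arbitrator: mean over constituents for each sample.

    `predictions` has shape (n_samples, n_constituents).
    """
    return np.asarray(predictions, dtype=float).mean(axis=1)

def error_weighted_ensemble(predictions, validation_errors):
    """Weighted-mean arbitrator with weights proportional to 1 / error."""
    preds = np.asarray(predictions, dtype=float)
    errors = np.asarray(validation_errors, dtype=float)
    if np.any(errors <= 0):
        raise ValueError("validation errors must be positive")
    weights = 1.0 / errors
    weights /= weights.sum()  # normalize so the weights sum to one
    return preds @ weights

# Hypothetical forecasts from two constituents for two samples.
constituent_outputs = [[1.0, 3.0],
                       [2.0, 4.0]]
simple = average_ensemble(constituent_outputs)
weighted = error_weighted_ensemble(constituent_outputs, [1.0, 3.0])
```

In the weighted rule, the constituent with the smaller validation error receives the larger weight, so the combined forecast leans toward the technique that has performed better on held-out data.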