Abstract
Cyber-attacks carried out with malware are growing in number and have become more sophisticated on almost all networks. Attacks that use advanced malware are the most complex, which makes them very hard to detect. Advanced malware can obfuscate many of its traces through mechanisms such as metamorphic engines. Predicting and detecting such malware has therefore become a significant challenge for malware analysis mechanisms. In this paper, we propose a multidimensional machine learning approach to predict Stuxnet-like malware in a dataset of malware samples by using five distinguishing features of advanced malware. We define the features by analyzing advanced malware samples in the wild. Our approach uses regression models to predict advanced malware. For experimental purposes, we create a malware dataset from existing datasets that contain real samples. The analysis results show high correlations among some features of advanced malware. These correlations yield better prediction scores, such as R² = 0.8203 for the Stuxnet closeness feature. The experimental analyses show that our approach is able to predict Stuxnet-like advanced malware if the prediction features are defined.
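To make the reported score concrete, the following minimal sketch shows how an R² value is obtained for a regression model. It is not the paper's pipeline: the feature name `obfuscation_level`, the target name `stuxnet_closeness`, and all numbers are hypothetical and synthetic, and a single-predictor least-squares fit stands in for whichever regression models were actually used.

```python
# Illustrative sketch only: synthetic numbers and hypothetical feature names,
# not the dataset or the models used in the paper.
from statistics import mean


def fit_simple_linear(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values for one predictor and the target (assumed names).
obfuscation_level = [0.1, 0.3, 0.4, 0.6, 0.8, 0.9]
stuxnet_closeness = [0.15, 0.25, 0.45, 0.55, 0.85, 0.8]

slope, intercept = fit_simple_linear(obfuscation_level, stuxnet_closeness)
predicted = [slope * x + intercept for x in obfuscation_level]
score = r2_score(stuxnet_closeness, predicted)
print(f"R2 = {score:.4f}")
```

An R² close to 1 means the predictor explains most of the variance in the target, which is the sense in which a score such as 0.8203 indicates a strong relationship between features.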
Introduction
The number and variety of attacks on computing systems, including all types of networks, are increasing at an enormous speed. This trend is driven by a large volume of diverse malware. This diversity has a huge impact on the cost of computing systems, and that cost depends on the success of the attacks. Advanced malware has become an effective tool for carrying out such attacks. Advanced malware is complex malicious software with very effective properties. Its main purpose is to accomplish targeted attacks with a high success ratio. Critical systems, in particular, are the main targets of advanced malware. This type of malware uses different attack vectors to accomplish its goal, and it has an exceptionally complex structure [1]. Moreover, advanced malware may use conventional malware, such as ransomware, to increase its success ratio [2]. As a result, many systems and networks have suffered considerably from advanced malware. For instance, financial systems and critical networks are targets of such malware [3,4]. Recently, malware has been used in many complex targeted attacks. Existing anti-malware systems and intrusion detection systems are able to detect some traces of attacks if they are carried out with conventional malware. In this paper, we divide malware into two categories, namely conventional and advanced malware, as in [5]. Conventional malware is malicious software that is already categorized in the literature, such as viruses and worms [6]. Moreover, this type of malware is almost always detectable with adequate anti-malware systems [7]. Advanced malware, on the other hand, has remained undetectable until the attack is completed [5].