Abstract
Applying Machine Learning (ML) techniques to intrusion detection systems (IDS) is key to coping with increasingly sophisticated cybersecurity attacks through an effective and efficient detection process. In the context of the Internet of Things (IoT), most ML-enabled IDS are centralized: IoT devices share their data with data centers for further analysis. To mitigate the privacy concerns associated with centralized approaches, Federated Learning (FL) has attracted significant interest in recent years in different sectors, including healthcare and transport systems. However, the development of FL-enabled IDS for IoT is in its infancy and still requires research efforts from various areas to identify the main challenges for deployment in real-world scenarios. In this direction, our work evaluates an FL-enabled IDS approach based on a multiclass classifier, considering different data distributions for the detection of different attacks in an IoT scenario. In particular, we use three different settings obtained by partitioning the recent ToN_IoT dataset according to IoT devices’ IP addresses and types of attack. Furthermore, we evaluate the impact of different aggregation functions in each setting, using the recent IBMFL framework as the FL implementation. Additionally, we identify a set of challenges and future directions based on the existing literature and the analysis of our evaluation results.
Introduction
The constant development and deployment of Internet of Things (IoT) technologies is increasing the attack surface of physical devices that could be exploited by malicious entities [1]. Well-known attacks, such as the Mirai botnet and its recent variants [2], demonstrate the need to strengthen IoT devices’ security in order to protect large-scale IoT-enabled systems. As such attacks grow more sophisticated, machine learning (ML) techniques have been widely considered in recent years for their detection and mitigation in IoT scenarios. Indeed, recent works propose applying diverse ML techniques (e.g., neural networks) to improve the detection capabilities of intrusion detection systems (IDS) by inferring potential attacks from the analysis of network traffic [3]. Despite the advantages that ML brings to IDS approaches (e.g., in terms of attack detection accuracy), most ML-enabled IDS deployments are centralized, so that a single entity receives the network traffic data from different devices to train an ML model. This entity therefore has access to all the network traffic derived from the communication of the devices participating in the training process, as well as to the devices’ local data, which could lead to privacy issues. This problem could be exacerbated in IoT scenarios due to the amount and sensitivity of the information exchanged through certain devices, such as wearable or eHealth systems [4]; therefore, decentralized data management solutions are of paramount importance [5].