Abstract
Introduction
Problem Formulation
Proposed Extended SAC-Based DRL Control for Cyber Attack Mitigation
Numerical Results
References
Abstract
The use of smart inverter capabilities of distributed energy resources (DERs) enhances grid reliability but, at the same time, exposes more vulnerabilities to cyber-attacks. This paper proposes a deep reinforcement learning (DRL)-based defense approach. The defense problem is reformulated as a Markov decision process that controls DERs and minimizes load shedding to address the voltage violations caused by cyber-attacks. The original soft actor-critic (SAC) method for continuous actions is extended to handle both discrete and continuous actions for controlling DER setpoints and load-shedding decisions. Numerical comparisons with other control approaches, such as Volt-VAR and Volt-Watt, on the modified IEEE 33-node system show that the proposed method achieves better voltage regulation and lower power losses in the presence of cyber-attacks.
Introduction
Smart control of distributed energy resources (DERs) in distribution systems is bringing a fundamental shift in how these networks are kept within their security limits. Historically, control methods were designed following conventional approaches in which cyber threats received little attention [1]. The introduction of internet protocols into the electrical network, to enable more advanced protection and control components, has created the need to defend against cyber-attacks. Furthermore, studies in [2], [3] have shown that only 62% of cyber-attacks can be recognized after they cause massive damage to the system, which makes this a critical issue for system designers. Nowadays, the digital transformation of electrical distribution systems imposes many restrictions and regulations that must be satisfied to achieve a secure and resilient system [4]. In the context of cyber-physical security [5], [6], smart attackers can launch false data injection (FDI) attacks [7], in which a slight change to any of the controllable devices (i.e., smart inverters, smart ring main units, and digital relays) can disturb network security without being detected by existing defense approaches. In this paper, we propose a learning-based approach for mitigating cyber-attacks on connected loads and DERs. Deep reinforcement learning (DRL) is adopted for its superior capability of learning power system constraints and achieving an optimal control strategy.
Numerical Results
A modified IEEE 33-node system with three-phase loads and 4 utility-owned DERs (i.e., each DER consists of 1 ES unit and 1 PV unit, with an installed capacity of 500 kW per unit) is used for testing. The system is modeled using OpenDSS and configured in grid-connected mode. In the first solution evaluated with OpenDSS, no voltage violation is observed under normal operation. In addition, 4 tie switches have been added to the network and sectionalizers are considered for operation. The learning environment is designed according to OpenAI Gym [14], a common interfacing library for defining the DRL environment for the agent. The SAC algorithm is implemented using PyTorch. Specifically, both the actor and critic networks are designed as feed-forward neural networks with three hidden layers of 50, 100, and 50 neurons and a ReLU activation function for each layer. Other SAC hyper-parameters are as follows: the Adam optimizer is used with a learning rate of 0.0001, and the discount factor γ is set to 0.9. The target network is updated with τ = 0.001, and a random process is applied for better exploration with α = 0.1, β = 0.1, ρ = 10, and µ = 0.1; the replay buffer size is 100,000 with a batch size of 256. The offline DRL training takes around 3 hours and 30 minutes on a laptop with a 3.6 GHz Intel i7 processor and 32.0 GB of RAM. The proposed DRL defense algorithm is compared with other control algorithms, such as Volt-VAR and Volt-Watt using the default control curves in OpenDSS [15]. The MPC algorithm is also implemented following [16], based on the problem formulation in Section II and using the same control and state variables as the proposed DRL algorithm. Attacks are initiated for a few timesteps by changing the load and DER setpoints (% power change of loads and/or DERs) in OpenDSS through its Python interface.
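To illustrate how such a learning environment can be organized, the sketch below wraps an OpenDSS feeder model behind a Gym-style interface. This is a minimal example under stated assumptions, not the authors' implementation: it assumes the opendssdirect Python bindings, and the class name, feeder file name, reward shaping, and action mapping are hypothetical placeholders.

```python
import gym
import numpy as np
from gym import spaces
import opendssdirect as dss  # one possible OpenDSS Python interface


class DERDefenseEnv(gym.Env):
    """Hypothetical Gym wrapper around a modified IEEE 33-node OpenDSS model.

    State: bus voltage magnitudes (p.u.); action: continuous DER setpoints
    plus a discrete load-shedding choice, mirroring the paper's setup.
    """

    def __init__(self, dss_file="ieee33_modified.dss", n_ders=4, n_loads=32):
        super().__init__()
        dss.run_command(f"compile {dss_file}")  # load the feeder model
        self.n_ders = n_ders
        # Continuous P/Q setpoints for each DER, normalized to [-1, 1],
        # plus a discrete choice of which load (if any) to shed.
        self.action_space = spaces.Dict({
            "der_setpoints": spaces.Box(-1.0, 1.0, shape=(2 * n_ders,), dtype=np.float32),
            "load_shedding": spaces.Discrete(n_loads + 1),  # 0 = shed nothing
        })
        n_buses = len(dss.Circuit.AllBusMagPu())
        self.observation_space = spaces.Box(0.0, 2.0, shape=(n_buses,), dtype=np.float32)

    def step(self, action):
        self._apply_action(action)              # write setpoints back into OpenDSS
        dss.Solution.Solve()                    # run the power flow
        v = np.array(dss.Circuit.AllBusMagPu(), dtype=np.float32)
        # Placeholder reward: penalize voltage violations and any shed load
        violation = np.sum(np.maximum(v - 1.05, 0.0) + np.maximum(0.95 - v, 0.0))
        reward = -violation - 0.1 * float(action["load_shedding"] > 0)
        return v, reward, False, {}             # episode termination handled elsewhere

    def reset(self):
        dss.run_command("solve")                # re-solve the base case
        return np.array(dss.Circuit.AllBusMagPu(), dtype=np.float32)

    def _apply_action(self, action):
        # Map normalized setpoints onto OpenDSS PV/storage elements (model-specific)
        pass
```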
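The reported network architectures and hyper-parameters can be expressed in PyTorch roughly as follows. This sketch covers only the standard continuous-action SAC components (Gaussian actor, Q-critic, Polyak target update); the paper's extension to mixed discrete and continuous actions, the entropy temperature, and the exploration noise process (α, β, ρ, µ) are omitted, and all names are illustrative.

```python
import torch
import torch.nn as nn

# Hyper-parameters reported in the paper
HIDDEN = (50, 100, 50)
LR, GAMMA, TAU = 1e-4, 0.9, 0.001
BUFFER_SIZE, BATCH_SIZE = 100_000, 256


def mlp(in_dim, out_dim, hidden=HIDDEN):
    """Feed-forward network with three ReLU hidden layers (50, 100, 50)."""
    layers, prev = [], in_dim
    for h in hidden:
        layers += [nn.Linear(prev, h), nn.ReLU()]
        prev = h
    layers.append(nn.Linear(prev, out_dim))
    return nn.Sequential(*layers)


class Actor(nn.Module):
    """Outputs mean and log-std of a Gaussian policy over continuous DER setpoints."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = mlp(state_dim, 2 * action_dim)

    def forward(self, state):
        mean, log_std = self.net(state).chunk(2, dim=-1)
        return mean, log_std.clamp(-20, 2)


class Critic(nn.Module):
    """Q(s, a) estimator; SAC typically keeps two critics plus target copies."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = mlp(state_dim + action_dim, 1)

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))


def soft_update(target, source, tau=TAU):
    """Polyak averaging of target network parameters (τ = 0.001)."""
    for t, s in zip(target.parameters(), source.parameters()):
        t.data.mul_(1.0 - tau).add_(tau * s.data)


# Example optimizer setup with the reported learning rate:
# actor, critic = Actor(s_dim, a_dim), Critic(s_dim, a_dim)
# actor_opt = torch.optim.Adam(actor.parameters(), lr=LR)
# critic_opt = torch.optim.Adam(critic.parameters(), lr=LR)
```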