Abstract
Introduction
Reinforcement learning
Problem formulation
Case study
Results & discussions
Conclusion
References
Abstract
This paper proposes a deep reinforcement learning-based approach to optimally manage the different energy resources within a microgrid. The proposed methodology considers the stochastic behavior of the main elements, which include the load profile, the generation profile, and pricing signals. The energy management problem is formulated as a finite-horizon Markov Decision Process (MDP) by defining the state, action, reward, and objective functions, without prior knowledge of the transition probabilities. Such a formulation does not require an explicit model of the microgrid; instead, it uses accumulated data and interaction with the microgrid to derive the optimal policy. An efficient reinforcement learning algorithm based on deep Q-networks is implemented to solve the developed formulation. To confirm the effectiveness of this methodology, a case study based on a real microgrid is carried out. The results demonstrate the capability of the proposed methodology to obtain online scheduling of various energy resources within a microgrid with cost-effective actions under stochastic conditions. The achieved costs of operation are within 2% of those obtained in the optimal schedule.
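For illustration, the sketch below shows one possible deep Q-network set-up of the kind described above. It is not the authors' implementation: the state vector (load, PV generation, price, battery state of charge, hour), the discretized action set, and all hyperparameters are assumptions made for the example, and PyTorch is used only as one convenient framework.

```python
# Minimal sketch of a DQN agent for microgrid energy management (illustrative only).
# Assumed state: [load, pv_generation, price, soc, hour]; assumed actions: discrete
# battery power set-points. None of these choices are taken from the paper.
import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM = 5    # assumed state features
N_ACTIONS = 5    # assumed discretized battery charge/discharge levels
GAMMA = 0.99     # discount factor

class QNetwork(nn.Module):
    """Feed-forward network approximating Q(s, a) for all discrete actions."""
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

policy_net = QNetwork(STATE_DIM, N_ACTIONS)
target_net = QNetwork(STATE_DIM, N_ACTIONS)
target_net.load_state_dict(policy_net.state_dict())
optimizer = torch.optim.Adam(policy_net.parameters(), lr=1e-3)
replay_buffer: deque = deque(maxlen=10_000)   # stores (s, a, r, s', done) tuples

def select_action(state: torch.Tensor, epsilon: float) -> int:
    """Epsilon-greedy action selection over the discrete action set."""
    if random.random() < epsilon:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(policy_net(state).argmax().item())

def train_step(batch_size: int = 64) -> None:
    """One gradient step on the temporal-difference (Bellman) error."""
    if len(replay_buffer) < batch_size:
        return
    batch = random.sample(replay_buffer, batch_size)
    states, actions, rewards, next_states, dones = map(
        lambda x: torch.as_tensor(x, dtype=torch.float32), zip(*batch)
    )
    q_values = policy_net(states).gather(1, actions.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Finite-horizon episodes end with the scheduling day, so bootstrapping
        # is masked out on terminal transitions.
        next_q = target_net(next_states).max(dim=1).values
        targets = rewards + GAMMA * next_q * (1.0 - dones)
    loss = nn.functional.mse_loss(q_values, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In such a set-up, the reward would typically be the negative of the operating cost incurred at each step, so that maximizing the return corresponds to minimizing the cost over the scheduling horizon.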
Introduction
A microgrid is defined as a group of loads and micro-sources operating under the control of one system [1]. A microgrid can operate in parallel with the utility grid to make optimal use of local power generation sources, or in islanded mode in case of a failure in the main grid, thus enhancing the overall reliability of local service. Many benefits are realized by adopting microgrids, including reduced greenhouse gas emissions, improved voltage profiles, decentralized power supply, and reduced line losses [2]. Microgrids also allow customers to actively participate in their operation [3].
The decline in renewable generation costs has driven the adoption of microgrid schemes. For example, the cost of manufacturing solar PV has fallen noticeably over the past years, accompanied by a large increase in installations. In fact, global solar PV capacity grew nearly tenfold within one decade, from 72 GW in 2011 to more than 707 GW in 2020 [4]. Nonetheless, there are several challenges in integrating renewable distributed generation resources, which mainly stem from their intermittent nature, making it difficult to schedule generation optimally as practiced in a conventional grid. Unforeseen power variations would necessitate committing expensive reserve or ancillary services, making microgrid operation uneconomical.
Conclusion
This paper proposed a deep reinforcement learning methodology based on the deep Q-network algorithm to solve the energy management problem of a given microgrid. The methodology considered the stochastic behavior of the different elements of a microgrid and modeled the grid elements while adhering to the various power flow constraints in a realistic setting. This formulation addresses several gaps in conventional methods, such as the dependence on experts to model the dynamics of the microgrid, the difficulty of achieving real-time scheduling, and the lack of a generalized framework for different environments. The developed framework can be adjusted for different microgrid architectures, providing the flexibility to test the set-up for different types of electrical grids. This study demonstrated that the deep reinforcement learning methodology, as implemented, can obtain near-optimal results. The methodology can perform online scheduling of the various energy resources within a grid and take cost-effective actions under stochastic conditions. The results were benchmarked against the optimal results obtained by a MILP solver that had full knowledge of the different stochastic variables. The achieved costs of operation are within 2% of the optimal case scenario. Additionally, the computation time of the developed method is on average 0.01 s per step, compared to 2.4 s with the MILP solver. Therefore, the proposed approach shows significant potential for real-time implementation.