Abstract
1. Introduction
2. Related works
3. Methodology
4. Experiments
5. Discussion
6. Conclusion
Disclosure statement
Notes on contributors
References
Abstract
Autonomous logistics cart transportation is a challenging problem because of the complicated dynamics of the logistics cart. In this paper, we tackle the problem by using a two-robot system with reinforcement learning. We formulate the task as making a logistics cart track an arc trajectory. Our reinforcement learning (RL) controller consists of a feedback controller and residual reinforcement learning. The feedback controller regards the logistics cart as a virtual leader and the robots as followers, and the robots' positions and velocities are controlled to maintain the formation between the logistics cart and the robots. Residual reinforcement learning is used to modify the feedback controller's output. Simulation results showed that the residual reinforcement learning controller trained in a physics simulation environment performed better than other methods, especially under conditions with large trajectory curvature. Moreover, the residual reinforcement learning controller can be transferred to a real-world robot without additional learning in the real-world environment.
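To make the control structure described above concrete, the following Python sketch combines a leader-follower feedback controller with a learned residual correction. It is a minimal illustration, not the authors' implementation: the class and function names, the gains, the state layout, and the residual scaling are all assumptions.

import numpy as np

class VirtualLeaderFeedback:
    # Leader-follower feedback controller: the logistics cart is the
    # virtual leader, and each robot is driven toward a fixed formation
    # offset expressed in the cart frame (gains kp, kd are assumed values).
    def __init__(self, offsets, kp=1.0, kd=0.5):
        self.offsets = offsets  # desired robot positions in the cart frame
        self.kp = kp
        self.kd = kd

    def command(self, cart_pose, cart_vel, robot_pos, robot_vel):
        # cart_pose = (x, y, yaw); returns one 2-D velocity command per robot.
        x, y, yaw = cart_pose
        rot = np.array([[np.cos(yaw), -np.sin(yaw)],
                        [np.sin(yaw),  np.cos(yaw)]])
        commands = []
        for off, p, v in zip(self.offsets, robot_pos, robot_vel):
            target = np.array([x, y]) + rot @ off  # formation target for this robot
            commands.append(self.kp * (target - p) + self.kd * (cart_vel - v))
        return np.asarray(commands)

def residual_rl_action(state, feedback_cmd, policy, scale=0.2):
    # Final command = feedback controller output + bounded learned residual.
    # `policy` is any trained network mapping state -> residual action;
    # `scale` keeps the residual a small correction to the base controller.
    return feedback_cmd + scale * policy(state)

Bounding the residual is a common design choice in residual RL: the feedback controller provides reasonable behavior from the start of training, and the policy only has to learn corrections for the cart's difficult dynamics, which is what makes the approach sample efficient.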
1. Introduction
Object transportation is increasingly being automated through the use of automated guided vehicles (AGVs) in large warehouses. However, this is less common in smaller warehouses, where objects are typically conveyed by human workers with logistics carts, because existing AGV-based automation systems do not support transporting existing logistics carts. For example, the space below a logistics cart is too small for an AGV to move under and lift it. To address this, an automated object transportation system for such warehouses [1] was proposed. In this system, the robots' positions are estimated from images taken by a ceiling-mounted camera, and two robots grasp a logistics cart and transport it, as shown in Figure 1. Having two robots hold the logistics cart makes it possible to automate the transportation without additional equipment. However, controlling the transport of a logistics cart remains a difficult problem because the robots need to keep holding the cart. There is currently no method for making a logistics cart track a trajectory.
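For the ceiling-camera localization step, one common approach (an assumption here, not necessarily the method used in [1]) is to back-project a detected robot marker onto the floor plane using a calibrated, downward-facing camera:

import numpy as np

def pixel_to_floor(u, v, K, cam_height):
    # Back-project pixel (u, v) from a calibrated ceiling camera whose
    # optical axis is perpendicular to the floor. K is the 3x3 intrinsic
    # matrix; cam_height is the camera's height above the floor in meters.
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # viewing ray, camera frame
    t = cam_height / ray[2]                         # scale to reach the floor
    x, y, _ = t * ray                               # floor point, camera frame
    return x, y

The robot's pose in the warehouse frame then follows from the known camera pose; marker detection itself can be handled by standard fiducial-marker libraries.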
6. Conclusion
We proposed a system for logistics cart transportation with a residual reinforcement learning controller. The proposed controller is more sample-efficient than a reinforcement learning controller trained from scratch and achieves higher performance than the feedback controller. We showed that using simulation reduces the cost of gathering experience, and the results of real-world experiments suggest that the residual reinforcement learning controller learned in a simulation environment can be transferred to real-world control.
As future work, we will investigate how to make the controller achieve higher performance than the feedback controller in all available states and how to reduce the gap between simulation and the real world.