Abstract
I. Introduction
II. Related Work
III. Methodology
IV. Experiments
V. Results Analysis
VI. Discussion
VII. Conclusion
References
Abstract
Intelligent computing applications are now widely used in many domains, including retail stores. The analysis of customer behaviour has become crucial for the benefit of both customers and retailers. In this regard, remote gaze estimation using deep learning has shown promising results in analyzing customer behaviour in retail due to its scalability, robustness, low cost, and non-interruptive nature. This study presents a three-stage, three-attention-based deep convolutional neural network for remote gaze estimation in retail using image data. In the first stage, we design a mechanism to estimate the 3D gaze of the subject using image data and monocular depth estimation. The second stage presents a novel three-attention mechanism to estimate the gaze in the wild from field-of-view, depth range, and object channel attentions. The third stage generates the gaze saliency heatmap from the output attention map of the second stage. We train and evaluate the proposed model on the benchmark GOO-Real dataset and compare the results with baseline models. Further, we adapt our model to real retail environments by introducing a novel Retail Gaze dataset. Extensive experiments demonstrate that our approach significantly improves remote gaze target estimation performance on the GOO-Real and Retail Gaze datasets.
I. Introduction
In today’s world, retail stores are becoming smarter with the availability of large volumes of data and the power to analyze them autonomously. Even with the rise of online shopping, most physical retail stores use smart applications in the purchasing process [1]. Several techniques and devices have been introduced to automate the shopping process and analyze shoppers’ behaviour inside stores. At the same time, the shopping experience is a key factor in the success of a retail business, since it affects customer satisfaction, customer purchase probability, and customer loyalty [2]–[4].
To improve the shopping experience and maximize business profits, it is essential to capture and analyze customers’ behaviour without interfering with their natural shopping journey [5], [6]. Various solutions have been introduced for customer behaviour analysis in retail using developments in computer vision technology. Examples include counting the number of people and detecting hot spots in retail [6] and public [7] spaces, and tracking shoppers’ emotions [5]. However, the existing solutions capture only coarse touch-points of a shopper’s journey and are vulnerable to unconstrained environment settings. With the adoption of computer vision technologies in gaze estimation, eye tracking-based solutions for customer behaviour analysis in retail have also appeared [1], [8]. Moreover, there are solutions based on virtual reality devices and head-mounted displays, wearable eye tracker-based solutions [9], and non-intrusive 3D eye tracking solutions [10]. However, these solutions do not fully satisfy retailers because of the high cost of 3D eye tracking solutions, the poor scalability of wearable and head-mounted display-based solutions, and the manual calibration that eye tracking systems require.
VII. Conclusion
Remote gaze saliency estimation in retail is a novel concept with significant potential for building innovative retail stores. In this study, we investigated the application of remote gaze saliency estimation to non-interruptive, low-cost, and scalable customer behaviour analysis in retail. We proposed a Depth-based Dual Attention model, a three-stage, three-attention-based deep CNN for gaze saliency estimation from back-head images in the wild. We developed four design solutions to comprehensively represent the parameters of the gaze saliency estimation problem in retail, and introduced the novel object channel and depth-rebasing components as hand-designed features. These components were first designed in our two preceding model architectures and then combined in the final model.
Extensive quantitative and qualitative analysis on the benchmark GOO-Real dataset demonstrates the superiority of the proposed models and the importance of the hand-designed components we introduced. Our proposed solution improved angular error by 33% compared with the current best work in the literature. Furthermore, we introduced Retail Gaze, a real-world retail gaze saliency estimation dataset, to ensure the validity and applicability of our proposed solution in real retail environments. The proposed solution achieved an angular error of 15.3° on the Retail Gaze dataset, which demonstrates that it performs favourably in real retail environments.
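The angular error reported above is not defined in this section. As a minimal sketch, assuming it follows the convention common in gaze-following work (the angle between the predicted and ground-truth gaze vectors, both drawn from the subject's head position in the image plane), it could be computed as follows. The function name `angular_error_deg` and its point arguments are hypothetical and used only for illustration.

```python
import math


def angular_error_deg(head, pred, target):
    """Angle in degrees between the head->pred and head->target vectors.

    Assumption (not stated in the source): all three arguments are (x, y)
    points in the same image coordinate system.
    """
    # Predicted and ground-truth gaze vectors, both anchored at the head.
    px, py = pred[0] - head[0], pred[1] - head[1]
    tx, ty = target[0] - head[0], target[1] - head[1]

    norm_p = math.hypot(px, py)
    norm_t = math.hypot(tx, ty)
    if norm_p == 0.0 or norm_t == 0.0:
        raise ValueError("gaze vector has zero length")

    # Cosine of the angle, clamped to [-1, 1] to guard against rounding.
    cos_angle = (px * tx + py * ty) / (norm_p * norm_t)
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))
```

Under this definition, a prediction perpendicular to the true gaze direction, such as `angular_error_deg((0, 0), (1, 0), (0, 1))`, gives an error of 90 degrees, and a prediction along the true direction gives 0 regardless of distance.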