Abstract
With the rapid development of deep convolutional neural networks, more and more computer vision tasks have been solved effectively. These solutions rely heavily on hardware performance. However, due to privacy concerns or network instability, convolutional neural networks often need to run directly on embedded platforms, where limited hardware resources raise critical challenges. In this paper, we design and implement an embedded inference framework to accelerate the inference of convolutional neural networks on embedded platforms. We first analyze the time-consuming layers in the inference process of the network, and then design optimization methods for these layers. We also design a memory pool tailored to neural networks. Our experimental results show that our embedded inference framework can run the classification model MobileNet in 80 ms and the detection model MobileNet-SSD in 155 ms on a Firefly-RK3399 development board.
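As a rough illustration of the memory-pool idea mentioned above (a minimal sketch only, not the paper's actual implementation; the class name MemoryPool and the methods Allocate and Reset are hypothetical), an arena-style pool in C++ might pre-allocate one block and serve aligned tensor buffers by bump-pointer allocation:

#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch of an arena-style memory pool for tensor buffers. One large
// block is allocated up front; buffers are carved out of it with
// bump-pointer allocation and reclaimed all at once after a forward
// pass, avoiding per-layer malloc/free calls during inference.
class MemoryPool {
public:
    explicit MemoryPool(std::size_t capacity)
        : buffer_(capacity), offset_(0) {}

    // Returns a pointer to `size` bytes aligned to `alignment`
    // (a power of two), or nullptr if the pool is exhausted.
    void* Allocate(std::size_t size, std::size_t alignment = 16) {
        std::uintptr_t base  = reinterpret_cast<std::uintptr_t>(buffer_.data());
        std::uintptr_t align = static_cast<std::uintptr_t>(alignment);
        std::uintptr_t aligned = (base + offset_ + align - 1) & ~(align - 1);
        if (aligned + size > base + buffer_.size()) return nullptr;
        offset_ = aligned + size - base;
        return reinterpret_cast<void*>(aligned);
    }

    // Reclaims every buffer at once, e.g. between two inference passes.
    void Reset() { offset_ = 0; }

private:
    std::vector<std::uint8_t> buffer_;  // backing storage
    std::size_t offset_;                // next free byte within buffer_
};

In such a scheme, each layer's output buffer is drawn from the pool during the forward pass and the whole arena is reset once the result has been read back, which is feasible because the lifetimes of intermediate feature maps in a feed-forward network are known in advance.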
Introduction
Convolutional neural networks (CNNs) play a very important role in the field of computer vision. Deep CNNs have greatly promoted the development of computer vision, especially in object recognition, object detection, and semantic segmentation. Since AlexNet [1] won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC 2012) [2], CNNs have become deeper and more complex in pursuit of higher accuracy, which has become the trend in network design [3]–[5]. However, in many real-world applications such as self-driving cars, robotics, and augmented reality, convolutional neural networks need to be deployed on embedded platforms with limited computing resources. Many embedded applications therefore rely on a cloud-based approach [6]–[12], in which the embedded platform is used only to capture data while the inference is performed on a server. A cloud-based approach enables users of embedded devices to enjoy the huge benefits of convolutional neural networks, but it has its disadvantages. First, because of communication costs, cloud-based applications depend heavily on network quality; to keep such applications practical, the amount of data sent by the embedded platform must be limited. Second, cloud-based approaches may involve private data, and sending personal data to the cloud is a challenge [13]. Although the rapid development of 5G offers a very attractive solution to the bandwidth problem, uploading data from embedded platforms to the cloud can still cause privacy problems.