ABSTRACT
Non-maximum suppression (NMS) is a commonly used method in the post-processing stage of object detection. However, because its constraint condition is simple, the NMS algorithm cannot effectively eliminate missed and false detections. To address the poor detection performance of the traditional NMS algorithm in dense scenes with highly overlapping objects, we design an RGB-D object detection network model based on the YOLO v3 framework that uses level-by-level metaphase fusion of the RGB and depth information, and we propose an improved NMS algorithm that fuses depth characteristics. The depth of the object in each detection box is used to determine whether highly overlapping detection boxes contain the same object, and the average depth of the pixels inside each detection box is calculated as a penalty term. This penalty term is then added to the detection box score to obtain a new constraint condition for non-maximum suppression. Experimental results on the NYU Depth V2 dataset show that the mean average precision (mAP) of the proposed Depth Fusion NMS algorithm is 0.8%, 0.5% and 0.3% higher than that of the Greedy-NMS, Soft NMS-L and Soft NMS-G methods, respectively. The comparison and analysis show that our method not only detects more overlapping objects but also achieves better object localization accuracy.
INTRODUCTION
Object detection is an important research direction in the field of computer vision. It can be understood as a vision algorithm giving the computer a human-like visual recognition ability: from an image obtained by a sensor, the computer identifies object categories and obtains the location of the objects in the scene. In recent years, with the rapid development of deep learning and neural network technology, research on object detection has produced breakthroughs in areas such as security monitoring, autonomous driving and human-computer interaction [1]. Object detection algorithms based on convolutional neural networks can be divided into three steps [2]: feature learning and object extraction, object classification and location regression, and non-maximum suppression to select the optimal detection boxes. Non-maximum suppression (NMS), the last step, was first proposed for edge detection and was later applied to fields such as object detection and face recognition [3], [4]. NMS is an important method in the post-processing step of a detection model. Current studies mainly focus on feature learning, feature extraction and classification, whereas there has been little improvement in non-maximum suppression algorithms [5].
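To make the third step concrete, the following is a minimal sketch of the conventional greedy NMS baseline, not the depth-fused method proposed in this paper. The function names `iou` and `greedy_nms`, the corner-format boxes `(x1, y1, x2, y2)`, and the overlap threshold of 0.5 are assumptions chosen for illustration only.

```python
from typing import List, Sequence, Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # assumed value, not taken from the paper
) -> List[int]:
    """Return the indices of the kept boxes, highest score first."""
    # Visit candidates in order of decreasing confidence score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Discard every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Two heavily overlapping boxes and one isolated box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
print(kept)
```

In this toy input the second box overlaps the first with an IoU of about 0.68, so it is discarded even if it encloses a different object. The only criterion is the overlap threshold, which is the simple constraint condition that limits the traditional algorithm in dense, highly overlapping scenes.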