Improved Density Peaks Clustering
Not translated


Persian title of the article: Improved density peaks clustering based on shared neighbors of local cores for manifold data sets
English title of the article: Improved Density Peaks Clustering Based on Shared-Neighbors of Local Cores for Manifold Data Sets
Journal/Conference: IEEE Access
Related fields of study: Computer Engineering
Related specializations: Algorithm Engineering and Computation
Persian keywords: shared neighbors, local cores, density peaks, clustering
English keywords: Shared-neighbors, local cores, density peaks, clustering
Type of writing: Research Article
Digital identifier (DOI): https://doi.org/10.1109/ACCESS.2019.2948422
University: College of Big Data and Intelligent Engineering, Yangtze Normal University, Chongqing 408100, China
Pages in the English article: 11
Publisher: IEEE
Presentation type: Journal
Article type: ISI
Publication year: 2019
Impact factor: 4.641 in 2018
H-index: 56 in 2019
SJR: 0.609 in 2018
ISSN: 2169-3536
Quartile: Q2 in 2018
English article format: PDF
Translation status: Not translated
English article price: Free
Is this a base article: No
Does this article have a conceptual model: No
Does this article have a questionnaire: No
Does this article have variables: No
Product code: E13882
References: In-text citations and a reference list at the end of the article
Table of contents (English)

Abstract

I. Introduction

II. Related Works

III. Preliminaries

IV. The Proposed Algorithm

V. Experimental Analysis

Authors

Figures

References

Excerpt from the article (English)

Abstract

A novel clustering algorithm by fast search and find of density peaks (DP) was proposed in Science in 2014. It has attracted much attention from researchers because it can easily select cluster centers with a decision graph. However, it cannot be used to cluster manifold data sets, as the existing distance measurement is not suitable for evaluating the dissimilarity between objects on a manifold structure. Some researchers use a graph-based distance to measure the dissimilarity between objects on manifold clusters, but computing the graph-based distance on the original data set is time-consuming. An improved density peaks clustering algorithm based on shared neighbors between local cores, SLORE-DP, is proposed in this paper. First, it finds local cores to represent the data set and redefines the graph-based distance between local cores with a shared-neighbors-based distance. Then the natural-neighbor-based density and the newly defined graph-based distance are used to construct a decision graph on the local cores, and the DP algorithm is employed to cluster the local cores. Finally, each remaining point is assigned to the same cluster as its local core. Since the newly defined graph-based distance is used to estimate the dissimilarity between local cores, SLORE-DP can be used to cluster manifold data sets; at the same time, it only calculates the shortest paths between local cores, which greatly reduces the running time of the algorithm. We conduct experiments on several synthetic data sets containing manifold clusters and several real data sets from UCI. The results show that SLORE-DP is more effective and efficient than other algorithms when clustering manifold data sets.
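The decision-graph idea mentioned in the abstract can be illustrated with a minimal sketch of the baseline DP procedure only, not of SLORE-DP. It assumes plain Euclidean distance and a cutoff-kernel density, so local cores, natural neighbors, and the shared-neighbors-based distance are all left out. The function name `density_peaks` and the parameters `d_c` and `n_clusters` are hypothetical names chosen for this illustration.

```python
import math

def density_peaks(points, d_c, n_clusters):
    """Baseline density peaks clustering with a cutoff kernel (illustrative sketch)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho_i: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit points from densest to sparsest; density ties are broken by index
    # (an assumption of this sketch, so that "higher density" is well defined).
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    # delta_i: distance to the nearest point that ranks higher in density.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # Usual convention: the densest point gets its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Decision graph: centers are the points with both large rho and large delta.
    # Here they are picked as the n_clusters largest values of rho * delta.
    centers = sorted(range(n), key=lambda i: (-rho[i] * delta[i], i))[:n_clusters]

    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Every other point inherits the label of its nearest higher-density point.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return rho, delta, labels

# Two small, well-separated groups of points.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
rho, delta, labels = density_peaks(points, d_c=1.2, n_clusters=2)
print(labels)
```

On this toy input the two middle points have the highest density and the largest `delta`, so they stand out as centers. Because `delta` here is a straight-line distance, the sketch inherits the limitation the abstract describes for manifold-shaped clusters.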

Introduction

As an unsupervised learning technique, clustering is an important method for data analysis. It has been widely used in fields such as pattern recognition, image processing, and information retrieval. It is designed to divide objects into multiple clusters, so that similar objects are in the same cluster while dissimilar objects are in different clusters. Many clustering algorithms have been proposed over the past few decades. According to their strategies, these algorithms can be roughly grouped into partitioning methods, density-based methods, hierarchical methods, model-based methods, and grid-based methods. Among them, partitioning, density-based, and hierarchical algorithms are the most popular because of their simple principles. K-means [1] and K-medoids [2] are typical partitioning algorithms. However, their performance depends on the selection of initial cluster centers. To avoid selecting cluster centers, the AP algorithm [3] treats all objects as potential centers. K-AP [4] is an improved AP algorithm that produces K clusters directly by introducing a constraint into the message-passing process. However, since each point is always allocated to the nearest center, these algorithms cannot discover arbitrary-shaped clusters. DBSCAN [5] is a typical density-based clustering algorithm. It defines clusters as dense regions separated by sparse regions. Dcore [6] is a hybrid decentralized approach that is based on finding density cores instead of centroids.
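The remark that nearest-center allocation cannot recover arbitrary-shaped clusters can be made concrete with a small sketch. It implements only the assignment step shared by center-based methods and applies it to two concentric rings. The helper names `assign_to_nearest_center` and `ring`, the ring radii, and the two center positions are all assumptions made for this illustration.

```python
import math

def assign_to_nearest_center(points, centers):
    """Label each point with the index of its closest center."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]

def ring(radius, n):
    """Return n points evenly spaced on a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]

inner = ring(1.0, 8)   # first cluster: small ring
outer = ring(5.0, 8)   # second cluster: large ring around the first

# With two centers, the boundary between the labels is the perpendicular
# bisector of the segment joining them, which is a straight line.
centers = [(-3.0, 0.0), (3.0, 0.0)]
inner_labels = assign_to_nearest_center(inner, centers)
outer_labels = assign_to_nearest_center(outer, centers)

print(sorted(set(inner_labels)))  # both labels occur on the inner ring
print(sorted(set(outer_labels)))  # both labels occur on the outer ring
```

Each ring ends up split between the two labels. A straight boundary cannot place one ring entirely on one side and the surrounding ring entirely on the other, which is the kind of case that density-based methods such as DBSCAN are designed to handle.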