K-DBSCAN: An improved DBSCAN algorithm for big data

Hamid Saadatfar

Authors	Hamid Saadatfar
Journal	Journal of Supercomputing
Pages	6214-6235
Serial number	77
Volume number	6
Impact factor (IF)	1.326
Article type	Full Paper
Publication date	2021
Journal rank	ISI
Journal type	Electronic
Country of publication	Iran
Journal index	JCR, Scopus

Abstract

Storing and processing big data are among the most important challenges today. Among data mining algorithms, DBSCAN is a widely used clustering method, but one of its main drawbacks is its low execution speed. This study aims to accelerate DBSCAN so that it can process big datasets in an acceptable amount of time. To this end, the data are first partitioned into groups using the K-means++ algorithm, and DBSCAN is then run on each group separately. This reduces the computational burden of DBSCAN and significantly increases clustering speed. Finally, border clusters are merged if necessary. In the reported results, the proposed algorithm greatly reduced DBSCAN execution time (by 98% in the best case) with no significant change in the qualitative evaluation criteria for clustering.
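The three stages named in the abstract (K-means++ grouping, per-group DBSCAN, merging of border clusters) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `kmeans_pp`, `dbscan`, and `k_dbscan`, the parameters `k`, `eps`, and `min_pts`, and especially the merge rule (join two clusters from different groups when any two of their points lie within `eps`) are assumptions made for illustration. The abstract does not specify how border clusters are detected or merged, and the naive all-pairs merge below would not scale to big data.

```python
import math
import random

def kmeans_pp(points, k, iters=20, seed=0):
    """Group index per point: k-means++ seeding followed by Lloyd iterations."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Sample the next center with probability proportional to the
        # squared distance to the nearest center chosen so far.
        d2 = [min(math.dist(p, c) ** 2 for c in centers) for p in points]
        centers.append(rng.choices(points, weights=d2)[0])
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign

def dbscan(points, eps, min_pts):
    """Plain DBSCAN with brute-force neighbor search; -1 marks noise."""
    n = len(points)
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # j is a core point, keep expanding
    return labels

def k_dbscan(points, k, eps, min_pts):
    """Sketch of the pipeline: group, cluster each group, merge across groups."""
    groups = kmeans_pp(points, k)
    labels = [-1] * len(points)
    next_id = 0
    for g in range(k):
        idx = [i for i, a in enumerate(groups) if a == g]
        local = dbscan([points[i] for i in idx], eps, min_pts)
        for i, lab in zip(idx, local):
            if lab != -1:
                labels[i] = next_id + lab  # make cluster ids globally unique
        next_id += max(local, default=-1) + 1
    # Assumed merge rule (not from the paper): union clusters from different
    # groups that have a pair of points within eps of each other.
    parent = list(range(next_id))
    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if (groups[i] != groups[j] and labels[i] != -1 and labels[j] != -1
                    and math.dist(points[i], points[j]) <= eps):
                parent[find(labels[i])] = find(labels[j])
    return [-1 if lab == -1 else find(lab) for lab in labels]

# Two small, well-separated blobs as toy input.
blob = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
points = blob + [(x + 10.0, y + 10.0) for x, y in blob]
labels = k_dbscan(points, k=2, eps=1.5, min_pts=3)
```

The speedup claimed in the abstract comes from the middle stage: each DBSCAN call only compares points within one group, so the quadratic neighbor search runs on much smaller inputs than a single pass over the whole dataset. A practical version would restrict the merge check to points near group boundaries rather than scanning all pairs as this sketch does.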

tags: Data mining · Clustering · Big data · DBSCAN algorithm · K-means++ algorithm