Automatic clustering of big datasets using a swarm intelligence method

Iman Behravan; Seyed Hamid Zahiri; Seyed Mohammad Razavi; Roberto Trasarti

Authors	Iman Behravan; Seyed Hamid Zahiri; Seyed Mohammad Razavi; Roberto Trasarti
Conference Title	International Congress and Exhibition of Sciences and Innovative Technologies
Holding Date of Conference	2018-09
Event Place	Babol
Presentation	SPEECH
Conference Level	International Conferences
Keywords	Automatic clustering, Big data analytics, K-means, Swarm intelligence

Abstract

Mining and discovering knowledge from big datasets has become an active field of
research among data scientists. Extracting hidden patterns from big datasets with
traditional data mining algorithms, in a reasonable amount of time and with acceptable
accuracy, is infeasible because of the high volume and complexity of the data. Generally,
the term big data refers to massive datasets containing a huge number of high-dimensional
samples, which makes them very hard to analyze with conventional data mining
techniques. Designing new and effective algorithms for analyzing big datasets is
therefore necessary. Clustering, the process of dividing data points into groups
based on their similarities and dissimilarities, is one of the most important methods in
data mining and big data mining. K-means, one of the most popular clustering algorithms
and widely used in the literature, suffers from several drawbacks: it tends to
converge to a local optimum, the quality of its final result depends on the randomly
generated initial centroids, and it cannot determine the number of clusters.
In this paper, a new automatic big data clustering method based on a swarm intelligence
algorithm is introduced; it is able to find the number of clusters and to escape
from local optima. The proposed method is tested on 13 synthetic datasets and 2
real big mobility datasets. The final results demonstrate its effectiveness in big data clustering.
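
The K-means drawbacks listed in the abstract can be illustrated with a small sketch. The code below is not the proposed swarm intelligence method; it is a plain Lloyd-style K-means on a toy one-dimensional dataset, and the function name `kmeans_1d`, the data, and the two sets of initial centroids are all assumptions chosen for illustration. It shows that the same algorithm, given the same data and the same (manually fixed) number of clusters, can end at very different solutions depending only on where the centroids start.

```python
def kmeans_1d(points, init_centroids, max_iter=100):
    """Plain Lloyd-style K-means on 1-D data; returns (centroids, sse)."""
    centroids = list(init_centroids)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: (p - centroids[j]) ** 2)
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [sum(c) / len(c) if c else centroids[j]
                   for j, c in enumerate(clusters)]
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    # Sum of squared distances from each point to its nearest centroid.
    sse = sum(min((p - c) ** 2 for c in centroids) for p in points)
    return centroids, sse


# Toy data (illustrative only): three well-separated pairs of points.
data = [0.0, 1.0, 10.0, 11.0, 20.0, 21.0]

# The number of clusters (3) must be supplied by hand in both runs.
good_centroids, good_sse = kmeans_1d(data, [0.0, 10.0, 20.0])
bad_centroids, bad_sse = kmeans_1d(data, [0.0, 1.0, 15.0])

print(good_centroids, good_sse)  # [0.5, 10.5, 20.5] 1.5
print(bad_centroids, bad_sse)    # [0.0, 1.0, 15.5] 101.0
```

The second run converges as well, but to a local optimum whose sum of squared errors is far worse than that of the first run. This sensitivity to initialization, together with the need to fix the number of clusters in advance, is what an automatic clustering method is meant to address.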

Assistant Professor Iman Behravan
