A Pitman measure of similarity in k-means for clustering heavy-tailed data

Authors: Javad Etminan
Journal: Communications in Statistics Part B: Simulation and Computation
Page number: 1595-1605
Serial number: 48
Volume number: 6
IF: 0.457
Paper Type: Full Paper
Published At: 2019
Journal Grade: ISI
Journal Type: Typographic
Journal Country: Iran, Islamic Republic Of
Journal Index: JCR, Scopus

Abstract

The k-means algorithm is one of the most popular methods for partitioning data into k clusters. Because the method relies on basic conditions such as the existence of a mean and a finite variance, it is unsuitable for data with infinite variance, such as data from heavy-tailed distributions. The Pitman Measure of Closeness (PMC) is a criterion that shows how close one estimator is to its parameter relative to another estimator. In this article, a new distance and a new clustering algorithm for heavy-tailed data are developed using the PMC within the k-means framework.
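
To make the PMC criterion concrete, the sketch below estimates P(|A - θ| < |B - θ|) by Monte Carlo for two location estimators, the sample median (A) and the sample mean (B), on Cauchy data, a standard heavy-tailed case with no finite mean or variance. It is an illustration only and not the distance or clustering algorithm developed in the paper; the function names, the sample size, the trial count, and the choice of the Cauchy distribution are all assumptions made for this example.

```python
import math
import random
import statistics


def sample_cauchy(rng, n, location=0.0):
    """Draw n Cauchy variates centred at `location` via the inverse-CDF method."""
    return [location + math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]


def estimate_pmc(estimator_a, estimator_b, theta, n=25, trials=2000, seed=0):
    """Monte Carlo estimate of P(|A - theta| < |B - theta|).

    A value above 0.5 means estimator A is Pitman-closer to theta than B.
    All parameters here are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        data = sample_cauchy(rng, n, location=theta)
        if abs(estimator_a(data) - theta) < abs(estimator_b(data) - theta):
            wins += 1
    return wins / trials


# Compare the sample median against the sample mean on heavy-tailed data.
pmc_median_vs_mean = estimate_pmc(statistics.median, statistics.fmean, theta=0.0)
print(f"Estimated PMC(median, mean) = {pmc_median_vs_mean:.3f}")
```

An estimate above 0.5 indicates that the median lands closer to the true location than the mean in most samples, which is the kind of behaviour that motivates replacing the mean-based distance of k-means when tails are heavy.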

tags: α-stable distributions; α-sub-Gaussian distributions; Heavy tail distributions; k-means clustering; Pitman measure of closeness