Some Versions of K-means Clustering Method and its Comparative Study in Low and High Dimensional Data

Publication Date: 09/03/2020


Author(s): Oti Eric Uchenna, Onyeagu Sidney Iheanyi.

Volume/Issue: Volume 3, Issue 1 (2020)



Abstract:

In this paper we present several versions of the k-means clustering method and compare them on simulated data and on low- and high-dimensional data sets, in terms of their accuracy and minimized total intra-cluster variance. The versions discussed are Forgy’s method, Lloyd’s method, MacQueen’s method, Hartigan and Wong’s method, Likas’ method, and Faber’s method. These methods minimize a given criterion by iteratively relocating points between clusters until a locally optimal partition is attained. In a basic iterative algorithm such as k-means, convergence is local, and a globally optimal solution cannot be guaranteed. In our experiments, Likas’ method and Faber’s method performed better on the synthetic data; Likas’ method performed better on the low-dimensional data (iris data), while Hartigan and Wong’s method did better on the high-dimensional data (yeast cell cycle data).
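The iterative relocation described in the abstract can be illustrated with a minimal sketch of batch k-means, in the style usually attributed to Lloyd. This is not the authors' implementation. The function names (`sq_dist`, `lloyd_kmeans`, `best_of_restarts`), the random initialisation from data points, the empty-cluster handling, and the toy data are all assumptions made for illustration. The returned `wcss` value is the total intra-cluster (within-cluster) sum of squared distances, the criterion being minimized.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def lloyd_kmeans(points, k, seed=0, max_iter=100):
    """Batch k-means sketch; returns (centers, labels, wcss)."""
    rng = random.Random(seed)
    # Assumption: initialise with k distinct data points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point is relocated to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # no point changed cluster: a locally optimal partition
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # assumption: keep the old center if a cluster empties
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    # Total intra-cluster sum of squared distances for the final partition.
    wcss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, wcss

def best_of_restarts(points, k, n_starts=10):
    """Run several seeds and keep the partition with the lowest wcss."""
    runs = (lloyd_kmeans(points, k, seed=s) for s in range(n_starts))
    return min(runs, key=lambda run: run[2])

# Hypothetical toy data: two well-separated groups of three points each.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels, wcss = best_of_restarts(toy, k=2)
```

Because each run stops at a local optimum that depends on the starting centers, the sketch keeps the best of several random restarts. This is a common workaround and still does not guarantee a globally optimal partition.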







This article is published under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)