New K-Means Clustering Methods that Minimize the Total Intra-Cluster Variance

Publication Date: 12/11/2020


Author(s): Eric U. Oti, Sidney I. Onyeagu, Chike H. Nwankwo, Waribi K. Alvan, George A. Osuji.

Volume/Issue: Volume 3, Issue 5 (2020)



Abstract:

In this paper, we present two new k-means clustering methods: the modified k-means method and the enhanced k-means method. The proposed modified k-means method updates cluster centroids according to whether a point is added to or removed from a cluster, while the enhanced k-means method uses the Minkowski distance as its metric in a normed vector space instead of the usual Euclidean distance used in the modified k-means method and the existing methods. K-means clustering is one of the simplest and most popular unsupervised learning techniques; its aim is to classify the points or objects under analysis into well-separated groups, or clusters. The existing k-means clustering methods discussed in this paper are Forgy’s method, Lloyd’s method, MacQueen’s method, and Hartigan and Wong’s method. On both simulated and real-life data sets, the modified k-means method performed relatively better than the enhanced k-means method and the existing methods in terms of accuracy and of minimizing the total intra-cluster variance.
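
The abstract does not give the exact update rules, so the following Python sketch is only an illustration of the ideas it names, not the authors' implementation. It assumes the standard running-mean formulas for a centroid when one point joins or leaves a cluster, the usual Minkowski distance of order p (p = 2 recovers the Euclidean distance), and the total intra-cluster variance taken as the sum of squared Euclidean distances from each point to its assigned centroid. All function names are hypothetical.

```python
from typing import List, Sequence

Point = Sequence[float]


def minkowski_distance(x: Point, y: Point, p: float = 2.0) -> float:
    """Minkowski distance of order p; p=2 is Euclidean, p=1 is Manhattan."""
    if p < 1:
        raise ValueError("p must be >= 1 for a valid metric")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def centroid_after_adding(centroid: Point, n: int, point: Point) -> List[float]:
    """Centroid of a cluster that had n points, after `point` joins it.

    Assumed running-mean update: c_new = (n * c + x) / (n + 1).
    """
    return [(n * c + x) / (n + 1) for c, x in zip(centroid, point)]


def centroid_after_removing(centroid: Point, n: int, point: Point) -> List[float]:
    """Centroid of a cluster that had n points, after `point` leaves it.

    Assumed running-mean update: c_new = (n * c - x) / (n - 1).
    """
    if n <= 1:
        raise ValueError("cannot remove the only point of a cluster")
    return [(n * c - x) / (n - 1) for c, x in zip(centroid, point)]


def total_intra_cluster_variance(
    points: Sequence[Point],
    labels: Sequence[int],
    centroids: Sequence[Point],
) -> float:
    """Sum of squared Euclidean distances from points to assigned centroids."""
    return sum(
        minkowski_distance(x, centroids[k], 2.0) ** 2
        for x, k in zip(points, labels)
    )
```

Updating a centroid incrementally in this way avoids recomputing the mean of the whole cluster each time a single point is reassigned; whether the paper's modified method uses exactly these formulas is an assumption here.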


Keywords:

K-Means Clustering, Centroid Update, Euclidean Distance, Intra-Cluster Variance, Unsupervised Classification






This article is published under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)