Identification of correct number of clusters
with high degree of overlap
O. Ammor1*, A. Lachkar2, K. Slaoui3, N. Rais4
1 Laboratoire de Modélisation et Calcul Scientifique. FSTF, Université Sidi Mohammed Ben Abdellah, Fès, Morocco.
2 Laboratoire LEIB. Dept Informatique. E.S.T.M, Université Moulay Ismail, Morocco.
3 Laboratoire LESSI. Faculté des Sciences Dhar Mehraz, Université Sidi Mohammed Ben Abdellah, Fès, Morocco.
4 Laboratoire ISQ. Faculté des Sciences Dhar Mehraz, Université Sidi Mohammed Ben Abdellah, Fès, Morocco.
* Corresponding author. E-mail: firstname.lastname@example.org
Received: 12 November 2006; revised version accepted: 06 March 2007
Many validity indexes have been proposed to evaluate clustering results. They tend to fail to select the right number of clusters when the clusters overlap, as in the IRIS data. To overcome this limitation, this paper proposes a new cluster validity index based on the Maximum Entropy Principle, named VMEP. VMEP finds the correct number of clusters whether or not the clusters overlap, even when the degree of overlap between clusters is high. Many simulated and real examples are presented, showing the superiority of VMEP over the existing indexes.
Keywords: Cluster validity; Optimal number of clusters; Overlapping clusters; Unsupervised clustering; Maximum Entropy Principle (MEP).