Robust and compact maximum margin clustering for high-dimensional data


ÇEVİKALP H., Chome E.

NEURAL COMPUTING & APPLICATIONS, vol.36, no.11, pp.5981-6003, 2024 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 36, Issue: 11
  • Publication Date: 2024
  • DOI: 10.1007/s00521-023-09388-x
  • Journal Name: NEURAL COMPUTING & APPLICATIONS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Applied Science & Technology Source, Biotechnology Research Abstracts, Compendex, Computer & Applied Sciences, Index Islamicus, INSPEC, zbMATH
  • Page Numbers: pp.5981-6003
  • Keywords: Hyperplane fitting, Large margin, Maximum margin clustering, Robust clustering, Subspace clustering
  • Affiliated with Eskişehir Osmangazi University: Yes

Abstract

Clustering is a widely studied problem in machine learning, and many clustering algorithms based on a variety of approaches have been proposed. This study focuses on clustering high-dimensional data using the maximum margin clustering approach. Two methods are introduced. The first employs the classical maximum margin clustering approach, which separates the data into two clusters with the largest possible margin between them. The second takes cluster compactness into account and searches for two parallel hyperplanes that best fit the cluster samples while being as far apart from each other as possible. Robust variants of both methods are also introduced to handle outliers and noise in the data. The resulting optimization problems are solved with a stochastic gradient algorithm, which allows all proposed methods to scale to large data sets. Experimental results demonstrate that the proposed methods are more effective than existing maximum margin clustering methods, particularly on high-dimensional clustering problems.
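
To make the first idea concrete, the following is a minimal sketch of two-cluster maximum margin clustering trained by stochastic gradient descent. It is not the authors' implementation. It assumes the commonly used unsupervised hinge objective, (lam/2)·||w||² plus the average of max(0, 1 - |w·x + b|), and it replaces the usual cluster balance constraint with a simple stand-in: the data are centered so that the mean score is zero. The function names `mmc_sgd` and `make_two_blobs`, the hyperparameters, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def make_two_blobs(n_per_cluster=100, dim=10, sep=4.0, seed=0):
    """Synthetic data (an assumption): two Gaussian blobs split along axis 0."""
    rng = np.random.default_rng(seed)
    y = np.repeat([-1, 1], n_per_cluster)
    X = rng.standard_normal((2 * n_per_cluster, dim))
    X[:, 0] += sep * y  # shift each blob to -sep or +sep on the first axis
    return X, y


def mmc_sgd(X, lam=0.01, lr=0.01, epochs=20, seed=0):
    """Sketch of maximum margin clustering with plain SGD.

    Minimizes (lam/2)*||w||^2 + mean(max(0, 1 - |w.x + b|)).
    Balance is handled by centering X, which fixes b = -w.mean (an assumption).
    """
    rng = np.random.default_rng(seed)
    mean = X.mean(axis=0)
    Xc = X - mean
    n, dim = Xc.shape
    w = 1e-3 * rng.standard_normal(dim)  # small random start, w = 0 is degenerate
    for _ in range(epochs):
        for i in rng.permutation(n):
            score = Xc[i] @ w
            grad = lam * w  # gradient of the regularizer
            if abs(score) < 1.0:
                # Sample lies inside the margin: push it away from the hyperplane
                grad = grad - np.sign(score) * Xc[i]
            w = w - lr * grad
    b = -(mean @ w)
    labels = np.where(X @ w + b >= 0, 1, -1)
    return w, b, labels


X, y = make_two_blobs()
w, b, labels = mmc_sgd(X)
# Cluster labels are only defined up to a swap, so score both assignments
accuracy = max(np.mean(labels == y), np.mean(labels == -y))
print(f"clustering accuracy (up to label swap): {accuracy:.2f}")
```

Each update touches one sample, so the cost per epoch is linear in the number of samples and the dimension, which is the property that makes stochastic gradient methods attractive for large data sets. The compact variant with two parallel hyperplanes and the robust variants described in the abstract are not covered by this sketch.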