Reaching Nirvana: Maximizing the Margin in Both Euclidean and Angular Spaces for Deep Neural Network Classification


Çevikalp H., Saribas H., Uzun B.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Publication Date: 2024
  • DOI: 10.1109/tnnls.2024.3437641
  • Journal Name: IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
  • Indexed In: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Aerospace Database, Applied Science & Technology Source, Biotechnology Research Abstracts, Business Source Elite, Business Source Premier, Communication Abstracts, Compendex, Computer & Applied Sciences, EMBASE, INSPEC, MEDLINE, Metadex, Civil Engineering Abstracts
  • Eskişehir Osmangazi University Affiliated: Yes

Abstract

The classification loss functions used in deep neural network classifiers can be split into two categories based on whether they maximize the margin in Euclidean or angular space. Methods that maximize the margin in Euclidean space use Euclidean distances between sample vectors during classification, whereas methods that maximize the margin in angular space use the cosine similarity at test time. This article introduces a novel classification loss that maximizes the margin in both the Euclidean and angular spaces simultaneously. As a result, the Euclidean and cosine distances produce similar, consistent results and complement each other, which in turn improves accuracy. The proposed loss function forces the samples of each class to cluster around the center that represents it. The class centers are chosen from the boundary of a hypersphere so that all pairwise distances between them are equal; this restriction corresponds to choosing the centers from the vertices of a regular simplex inscribed in the hypersphere. The proposed loss function can be applied to classical classification problems effortlessly, as there is only a single user-set hyperparameter, and setting it is straightforward. In addition, because the samples of the known classes cluster compactly around their corresponding centers, the proposed method can effectively reject test samples from unfamiliar classes by measuring their distances to the known class centers, which makes the technique especially suitable for open set recognition problems. Despite its simplicity, experimental studies demonstrate that the proposed method outperforms competing techniques in both open set recognition and classical classification.
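The equidistant-center construction described above can be made concrete. The following NumPy sketch (illustrative code, not taken from the paper; the function name and signature are assumptions) places num_classes centers at the vertices of a regular simplex inscribed in a hypersphere of a given radius:

    import numpy as np

    def simplex_centers(num_classes: int, feat_dim: int, radius: float = 1.0) -> np.ndarray:
        """Vertices of a regular (num_classes - 1)-simplex inscribed in a
        hypersphere of the given radius in R^feat_dim, so that all pairwise
        distances between the returned centers are equal."""
        assert feat_dim >= num_classes - 1, "feature dimension too small for a regular simplex"
        # Center the standard basis of R^num_classes: the rows e_i - mean
        # are equidistant, have equal norms, and span a (K - 1)-dim subspace.
        eye = np.eye(num_classes)
        centered = eye - eye.mean(axis=0)
        # Project onto an orthonormal basis of that subspace (an isometry,
        # since the discarded singular direction has singular value zero).
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        coords = centered @ vt[: num_classes - 1].T          # (K, K - 1)
        # Scale onto the hypersphere of the requested radius.
        coords *= radius / np.linalg.norm(coords, axis=1, keepdims=True)
        # Zero-pad into the full feature space R^feat_dim.
        centers = np.zeros((num_classes, feat_dim))
        centers[:, : num_classes - 1] = coords
        return centers

For example, simplex_centers(10, 128) returns ten unit-norm centers in R^128 whose 45 pairwise distances are all identical, which is exactly the equidistance restriction stated in the abstract.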
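The abstract does not give the exact form of the loss, so the PyTorch sketch below is only one plausible center-based formulation consistent with its description: each feature is pulled toward its fixed simplex center and pushed at least a margin closer to its own center than to any competing center. Because the centers lie on the hypersphere and the features cluster tightly around them, a small Euclidean distance to a center also implies a small angular distance, which is the sense in which the two margins become consistent. All names and the default margin value are illustrative assumptions.

    import torch

    def center_margin_loss(feats, labels, centers, margin=1.0):
        # Hypothetical sketch, not the authors' exact loss.
        d = torch.cdist(feats, centers)                      # (N, K) Euclidean distances
        own = d.gather(1, labels.view(-1, 1)).squeeze(1)     # distance to own class center
        # Distance to the nearest competing center (own center masked out).
        other = d.scatter(1, labels.view(-1, 1), float("inf")).min(dim=1).values
        # Pull toward the own center and enforce a Euclidean margin.
        return (own + torch.relu(margin + own - other)).mean()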
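Open-set rejection then follows directly from the geometry: a test sample that lies far from every known class center is flagged as unknown. A minimal sketch, where the threshold and the -1 "unknown" label are illustrative choices rather than details from the paper:

    def predict_open_set(feats, centers, threshold):
        # Assign each sample to its nearest known class center, but mark it
        # as unknown (-1) when even the nearest center exceeds the threshold.
        d = torch.cdist(feats, centers)
        min_d, pred = d.min(dim=1)
        pred = pred.clone()
        pred[min_d > threshold] = -1
        return pred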