Visual Object Detection Using Cascades of Binary and One-Class Classifiers

ÇEVİKALP, HAKAN; Triggs, Bill

doi:10.1007/s11263-016-0986-2

ÇEVİKALP H., Triggs B.

INTERNATIONAL JOURNAL OF COMPUTER VISION, vol. 123, no. 3, pp. 334-349, 2017 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 123 Issue: 3
Publication Date: 2017
DOI: 10.1007/s11263-016-0986-2
Journal Name: INTERNATIONAL JOURNAL OF COMPUTER VISION
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Pages: pp. 334-349
Keywords: Object detection, Rejection cascade, One-Class Classifier, Latent training, SUPPORT, SYSTEM
Affiliated with Eskişehir Osmangazi Üniversitesi: Yes

Abstract

We describe an efficient approach to visual object detection that uses short cascades of asymmetric 'one class' classifiers to quickly reject negatives (windows not centered on an object of the desired class) within a sliding window framework. Current detectors typically use binary discriminants such as Support Vector Machines or Boosting to implement each stage of the cascade. These treat the positive and negative classes symmetrically. We argue that this is suboptimal because object detectors typically see a great many negative windows with extremely diverse contents and only a few positive ones with comparatively coherent contents. We show that asymmetric representations that focus on tightly modeling the extent of the rare, coherent positive class can lead to simpler classifiers and faster rejection. Our cascades use asymmetric classifiers based on simple convex models to progressively tighten the bound on the positive class. They typically start with a conventional linear SVM for initial pruning, followed by a cascade of linear distance-to-hyperplane and interior-of-hypersphere classifiers and finishing with a kernelized hypersphere classifier. We show that the resulting detectors have competitive performance on the Labeled Faces in the Wild dataset and state-of-the-art performance on the FDDB face detection, ESOGU face detection and INRIA Person datasets. The results on the PASCAL VOC 2007 dataset are also respectable given that they use neither object parts nor context. The one-class formulations provide significant reductions in classifier complexity relative to the corresponding two-class ones, making them suitable for real-world applications.
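
The stage ordering described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every function name, parameter and threshold below is hypothetical, the models are hand-set rather than trained, and an RBF kernel is assumed for the final stage purely for illustration. The sketch only shows how a short rejection cascade might pass surviving windows from a binary linear SVM through progressively tighter one-class tests.

```python
import numpy as np

# Each stage factory returns a function mapping an (n, d) feature matrix to a
# boolean mask: True = "still possibly a positive window".
# All parameters are hypothetical placeholders, not values from the paper.

def linear_svm_stage(w, b):
    # Conventional binary linear discriminant used for initial pruning.
    return lambda X: X @ w + b >= 0.0

def hyperplane_distance_stage(w, b, max_dist):
    # One-class test: keep windows lying close to a hyperplane fitted to positives.
    norm = np.linalg.norm(w)
    return lambda X: np.abs(X @ w + b) / norm <= max_dist

def hypersphere_stage(center, radius):
    # One-class test: keep windows inside a hypersphere bounding the positives.
    return lambda X: np.linalg.norm(X - center, axis=1) <= radius

def kernel_hypersphere_stage(support, alpha, radius_sq, gamma):
    # Kernelized hypersphere with an (assumed) RBF kernel. The centre is
    # c = sum_i alpha_i * phi(s_i), so the squared feature-space distance is
    # k(x, x) - 2 * sum_i alpha_i k(x, s_i) + alpha^T K alpha, with k(x, x) = 1.
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-gamma * d2)

    const = alpha @ rbf(support, support) @ alpha

    def accept(X):
        d2 = 1.0 - 2.0 * (rbf(X, support) @ alpha) + const
        return d2 <= radius_sq

    return accept

def run_cascade(stages, X):
    # Apply stages in order; only survivors of one stage reach the next,
    # so cheap early stages absorb most of the negative windows.
    alive = np.arange(len(X))
    for stage in stages:
        if alive.size == 0:
            break
        alive = alive[stage(X[alive])]
    return alive

# Toy 2-D "window features"; real detectors would use high-dimensional descriptors.
X = np.array([[0.0, 0.0], [1.0, 1.0], [1.2, 0.9], [5.0, 5.0], [-3.0, 1.0]])

stages = [
    linear_svm_stage(np.array([1.0, 1.0]), -1.0),
    hyperplane_distance_stage(np.array([1.0, -1.0]), 0.0, max_dist=0.5),
    hypersphere_stage(np.array([1.0, 1.0]), radius=1.0),
    kernel_hypersphere_stage(np.array([[1.0, 1.0]]), np.array([1.0]),
                             radius_sq=0.2, gamma=1.0),
]

survivors = run_cascade(stages, X)
print(survivors)  # indices of windows accepted by every stage
```

In this toy run the linear SVM discards two windows immediately, the linear hypersphere rejects the far outlier, and only the two windows near the hypothetical positive centre reach and pass the kernelized stage. The expensive kernel evaluation is therefore applied to a small fraction of the input, which is the efficiency argument the abstract makes for cascades.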