NON-SPEECH ENVIRONMENTAL SOUND CLASSIFICATION USING SVMS WITH A NEW SET OF FEATURES

Uzkent, Burak; Barkana, Buket; Çevikalp, Hakan

Uzkent B., Barkana B. D., Çevikalp H.

INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, vol.8, pp.3511-3524, 2012 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 8
Publication Date: 2012
Journal Name: INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Page Numbers: pp.3511-3524
Keywords: Environmental sound classification, Feature extraction, Mel-frequency cepstral coefficients (MFCCs), Support vector machines, Radial basis function (RBF) neural network, Recognition
Affiliated with Eskişehir Osmangazi University: Yes

Abstract

Mel-frequency cepstral coefficients (MFCCs) are regarded as a feature extraction method for stationary or pseudo-stationary signals. They work very well for classifying speech and music signals. MFCCs have also been used to classify non-speech sounds in audio surveillance systems, even though they do not fully reflect the time-varying characteristics of non-stationary non-speech signals. We introduce a new 2D feature set, obtained with a feature extraction method based on the pitch range (PR) of non-speech sounds and the autocorrelation function. We compare the classification accuracy of the proposed features with that of MFCCs, using Support Vector Machine (SVM) and Radial Basis Function (RBF) neural network classifiers. The non-speech environmental sounds studied were gunshot, glass breaking, scream, dog barking, rain, engine, and restaurant noise. The new feature set yields high accuracy rates when used on its own for classification. Combining it with MFCCs improves the accuracy rates of the classifiers by 4% to 35%, depending on the classifier used, which suggests that the two feature sets are complementary. The SVM classifier with a Gaussian kernel achieved the highest accuracy rates among the classifiers used in this study.
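The abstract names the ingredients of the proposed features (frame-level pitch, the autocorrelation function, and a pitch range) but does not define them, so the following is only a minimal sketch of the general idea under stated assumptions. It estimates one pitch value per frame from the lag with the largest autocorrelation and then summarizes the spread of those values as two numbers. The function names, the 50-500 Hz search band, the frame and hop sizes, and above all the choice of the two summary statistics are hypothetical and are not taken from the paper.

```python
import math
import statistics


def sine(freq_hz, n_samples, sample_rate):
    """Synthetic test tone, used here only to exercise the sketch."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def autocorrelation(frame, lag):
    """Unnormalized autocorrelation of one frame at a single lag."""
    return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))


def estimate_pitch(frame, sample_rate, fmin=50.0, fmax=500.0):
    """Pitch in Hz from the lag with the largest autocorrelation.

    The search band [fmin, fmax] is an assumed default, not a value
    from the paper. No voicing or silence detection is attempted.
    """
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag = max(range(min_lag, max_lag + 1),
                   key=lambda lag: autocorrelation(frame, lag))
    return sample_rate / best_lag


def pitch_range_features(signal, sample_rate, frame_len=400, hop=200):
    """Hypothetical 2D descriptor of pitch variation across frames.

    Returns (max pitch - min pitch, population standard deviation of pitch).
    These two statistics are placeholders chosen for illustration.
    Assumes the signal contains at least one full frame.
    """
    pitches = [estimate_pitch(signal[s:s + frame_len], sample_rate)
               for s in range(0, len(signal) - frame_len + 1, hop)]
    return (max(pitches) - min(pitches), statistics.pstdev(pitches))


if __name__ == "__main__":
    sr = 8000
    # Two back-to-back tones, analyzed as two non-overlapping frames.
    clip = sine(200.0, 400, sr) + sine(400.0, 400, sr)
    print(pitch_range_features(clip, sr, frame_len=400, hop=400))
```

In a setup like the one the abstract describes, a two-value descriptor of this kind would be concatenated with the MFCC vector of each clip before being passed to a classifier such as an SVM with a Gaussian kernel. The unnormalized autocorrelation used here favors shorter lags, which helps avoid octave errors on clean tones but is not robust on noisy recordings.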