A unified language model for large vocabulary continuous speech recognition of Turkish

Arisoy, Ebru; Dutagaci, Helin; Arslan, Levent

doi:10.1016/j.sigpro.2005.12.002

Arisoy E., Dutagaci H., Arslan L. M.

Signal Processing, vol. 86, pp. 2844-2862, 2006 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 86
Publication Date: 2006
DOI: 10.1016/j.sigpro.2005.12.002
Journal Name: Signal Processing
Indexed In: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Pages: 2844-2862
Keywords: statistical language modeling, large vocabulary continuous speech recognition, Turkish newspaper content transcription, agglutinative language
Affiliated with Eskişehir Osmangazi University: Yes

Abstract

We have designed a Turkish dictation system for a newspaper content transcription application. Turkish is an agglutinative language with free word order. When words are used as recognition units, these characteristics lead to vocabulary explosion, a large number of out-of-vocabulary (OOV) words, and increased complexity of n-gram language models in speech recognition. In this paper, alternative language modeling units such as "stems and endings", "stems and morphemes", and "syllables" are investigated instead of "words". These recognition units are compared in terms of vocabulary size, coverage, bigram perplexity, and speech recognition performance. A combined model is proposed that aims to balance the OOV rate against the amount of phoneme sequence constraints on recognition units. The proposed model resulted in letter error rates (LERs) of approximately 28% for a speaker-independent system and 20% for a speaker-dependent system. These error rates are lower than those of the traditional word-based model for the newspaper content transcription application. (c) 2005 Elsevier B.V. All rights reserved.
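The abstract names two of its measures, the OOV rate and the letter error rate, without defining them. The Python sketch below illustrates how such measures are commonly computed; it is not the authors' implementation. The function names are hypothetical, the LER is assumed to be character-level Levenshtein distance divided by the reference length, and ignoring whitespace is an assumption made only for this illustration.

```python
def oov_rate(tokens, vocabulary):
    """Fraction of running tokens that are not in the recognition vocabulary."""
    if not tokens:
        return 0.0
    missing = sum(1 for token in tokens if token not in vocabulary)
    return missing / len(tokens)


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def letter_error_rate(reference, hypothesis):
    """Character-level error rate; whitespace is dropped (an assumption here)."""
    ref = [c for c in reference if not c.isspace()]
    hyp = [c for c in hypothesis if not c.isspace()]
    if not ref:
        raise ValueError("reference must contain at least one letter")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Toy word-level vocabulary: the inflected form "evlerimizde" is missing.
    words = ["ev", "evler", "evlerimizde"]
    print(oov_rate(words, {"ev", "evler"}))
    print(letter_error_rate("kitap", "kitab"))
```

A letter-level measure like this lets word-based and sub-word-based systems be scored on the same scale, since the output of both can be reduced to a letter sequence regardless of how the recognition units were segmented.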