A unified language model for large vocabulary continuous speech recognition of Turkish


Arisoy E., Dutagaci H., Arslan L. M.

SIGNAL PROCESSING, vol.86, pp.2844-2862, 2006 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 86
  • Publication Date: 2006
  • DOI: 10.1016/j.sigpro.2005.12.002
  • Journal Name: SIGNAL PROCESSING
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Pages: pp.2844-2862
  • Keywords: statistical language modeling, large vocabulary continuous speech recognition, Turkish newspaper content transcription, agglutinative language
  • Affiliated with Eskişehir Osmangazi University: Yes

Abstract

We have designed a Turkish dictation system for a newspaper content transcription application. Turkish is an agglutinative language with free word order. When words are used as recognition units, these characteristics result in vocabulary explosion, a large number of out-of-vocabulary (OOV) words, and increased complexity of n-gram language models in speech recognition. In this paper, alternative language modeling units such as "stems and endings", "stems and morphemes", and "syllables" are investigated instead of "words". These recognition units are compared in terms of vocabulary size, coverage, bigram perplexity, and speech recognition performance. A combined model is proposed that aims to balance the OOV rate against the amount of phoneme sequence constraints on recognition units. The proposed model resulted in letter error rates (LERs) of approximately 28% for a speaker-independent system and 20% for a speaker-dependent system. These error rates are lower than those of the traditional word-based model for the newspaper content transcription application. (c) 2005 Elsevier B.V. All rights reserved.
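To make two of the metrics named in the abstract concrete, the following is a minimal sketch of how an OOV rate and a letter error rate could be computed. It is not taken from the paper: the abstract does not define LER precisely, so the sketch assumes the common definition of character-level edit distance divided by the reference length, with spaces counted as characters. The function names `edit_distance`, `letter_error_rate`, and `oov_rate` are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def letter_error_rate(reference, hypothesis):
    """Assumed definition: character edit distance / reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def oov_rate(tokens, vocabulary):
    """Fraction of test tokens that are missing from the vocabulary."""
    if not tokens:
        raise ValueError("tokens must be non-empty")
    missing = sum(1 for t in tokens if t not in vocabulary)
    return missing / len(tokens)


if __name__ == "__main__":
    # Toy Turkish example: one inflected form is absent from the word list.
    print(oov_rate(["ev", "evler", "evlerde"], {"ev", "evler"}))
    print(letter_error_rate("evlerde", "evlerd"))
```

With a word vocabulary, every unseen inflected form counts as OOV, whereas a sub-word vocabulary (stems plus endings, for example) can cover it by concatenation. A character-level measure such as LER also allows systems with different recognition units to be scored on the same scale.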