IEEE 15th Signal Processing and Communications Applications Conference, Eskişehir, Türkiye, 11-13 June 2007, pp. 471-472
In this study, we investigate the word error rate of a Turkish continuous speech recognition system under sparse packet losses in a distributed architecture. In this architecture, speech feature vectors consisting of MFCCs and logarithmic power are transmitted over UDP. A special UDP header is defined for use in the distributed system. Sparse packet losses are generated artificially under different scenarios. Two packet loss concealment methods, Lagrange and spline interpolation, are applied as a front-end process in the recognition system. In the experiments, speech feature vectors are extracted using HTK. The SRI Language Modeling toolkit is used to generate statistical language models. Acoustic modeling and recognition are performed using AT&T software. The word error rate (WER) of the baseline system is 32.1%. With sparse packet losses, it rises to as much as 34.2%. The packet loss concealment methods reduce the WER of the speech recognition system to 32.4%.
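To make the concealment step concrete, the following is a minimal sketch of Lagrange interpolation over lost feature frames. It is not the implementation used in the study: the abstract does not give the polynomial order or the number of neighboring frames, so the `context` parameter, the function names `lagrange_eval` and `conceal_losses`, and the use of `None` to mark a lost frame are all assumptions made for illustration. Spline interpolation would follow the same pattern with a different interpolant and is not shown.

```python
from typing import List, Optional, Sequence

Frame = List[float]  # one feature vector (e.g. MFCCs plus log power)


def lagrange_eval(xs: Sequence[float], ys: Sequence[float], x: float) -> float:
    """Evaluate the Lagrange polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += yi * weight
    return total


def conceal_losses(frames: Sequence[Optional[Frame]], context: int = 2) -> List[Frame]:
    """Fill lost frames (marked None) by per-coefficient Lagrange interpolation.

    `context` is a hypothetical parameter: the number of received frames on
    each side of a lost frame that serve as interpolation nodes.
    """
    received = [t for t, f in enumerate(frames) if f is not None]
    if not received:
        raise ValueError("no received frames to interpolate from")
    dim = len(frames[received[0]])

    out: List[Frame] = []
    for t, frame in enumerate(frames):
        if frame is not None:
            out.append(list(frame))  # received frames pass through unchanged
            continue
        # Nearest received frames before and after the lost frame index t.
        before = [r for r in received if r < t][-context:]
        after = [r for r in received if r > t][:context]
        nodes = before + after
        out.append([
            lagrange_eval(nodes, [frames[r][k] for r in nodes], t)
            for k in range(dim)
        ])
    return out
```

Calling `conceal_losses(frames)` on a sequence in which lost packets appear as `None` returns a sequence of the same length with every gap filled. Each coefficient is interpolated independently along the time axis, and when the neighboring frames vary linearly in time the reconstructed frame lies on that line.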