ISSN Print: 2381-1218 ISSN Online: 2381-1226

Computational and Applied Mathematics Journal

Speech Recognition Performance as Measure of Speech Dereverberation Quality

Computational and Applied Mathematics Journal
Vol. 1, No. 3, Publication Date: Apr. 18, 2015, Pages: 60-66

Authors

[1]	Arkadiy Prodeus, Acoustic and Electroacoustic Department, Faculty of Electronics, NTUU KPI, Kyiv, Ukraine.

Abstract

This paper proposes parameters of the late reverberation suppression technique that are optimal in the sense of maximizing automatic speech recognition (ASR) accuracy. It is shown that 50 ms, the value usually taken as the boundary between early reflections and late reverberation when speech quality and intelligibility are studied, is not the best choice for ASR systems, for which the optimal value is 100 ms. It is also shown that, when the late reverberation power spectrum is estimated, the optimal value of the averaging parameter should be tied to statistical speech constants such as phoneme duration and the duration of stationary speech segments. Several speech quality indicators were compared, and recognition accuracy was found to be the best indicator of the compromise reached between reverberation suppression and speech distortion.
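
To make the boundary discussed in the abstract concrete, the sketch below splits a room impulse response at a chosen boundary and reports the early-to-late energy ratio in dB for boundaries of 50 ms and 100 ms. It is an illustration only and not the paper's method: the impulse response is synthetic (exponentially decaying white noise), and the function names, the 16 kHz sampling rate, and the 0.6 s reverberation time are assumptions made for this example.

```python
import numpy as np


def split_energy_ratio_db(rir, fs, boundary_ms):
    """Early-to-late energy ratio (dB) of an impulse response split at boundary_ms."""
    n = int(round(fs * boundary_ms / 1000.0))  # boundary expressed in samples
    early = np.sum(rir[:n] ** 2)               # energy treated as early reflections
    late = np.sum(rir[n:] ** 2)                # energy treated as late reverberation
    return 10.0 * np.log10(early / late)


def synthetic_rir(fs, t60, length_s, seed=0):
    """Hypothetical test signal: white noise whose amplitude falls 60 dB over t60 seconds."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * length_s)) / fs
    decay = 10.0 ** (-3.0 * t / t60)           # 10^-3 in amplitude equals -60 dB
    return rng.standard_normal(t.size) * decay


fs = 16000                                     # assumed sampling rate, Hz
rir = synthetic_rir(fs, t60=0.6, length_s=1.0) # assumed reverberation time, s

for boundary_ms in (50, 100):
    ratio = split_energy_ratio_db(rir, fs, boundary_ms)
    print(f"boundary {boundary_ms} ms: early-to-late ratio {ratio:.2f} dB")
```

Moving the boundary from 50 ms to 100 ms reassigns part of the decaying tail from the late part to the early part, so the ratio necessarily increases. Which boundary actually serves an ASR system best is an empirical question that the paper answers through recognition experiments, not something this sketch can decide.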

Keywords

Late Reverberation Suppression, Optimal Parameter Values, Speech Quality, Speech Recognition Accuracy
