Binaural Speech Enhancement using Deep Complex Convolutional Recurrent Networks
The demos consist of binaural signals; headphones are recommended for listening.
- The binaural signals are generated using measured HRIRs from the Kayser database [2].
- The speech signals are taken from the VCTK Corpus [3].
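As a rough illustration of how a binaural signal can be produced from a mono utterance and a pair of head-related impulse responses (HRIRs), the sketch below convolves the speech with a left and a right HRIR. The function name `render_binaural` and the toy impulse responses are assumptions made for this example; they are not taken from the Kayser database or from the authors' processing chain.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Convolve a mono signal with a left/right HRIR pair.

    Returns an array of shape (len(mono) + len(hrir) - 1, 2).
    Both HRIRs are assumed to have the same length.
    """
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=-1)

# Toy data standing in for real speech and measured HRIRs (hypothetical values).
rng = np.random.default_rng(0)
mono = rng.standard_normal(1600)

hrir_left = np.zeros(32)
hrir_left[0] = 1.0          # direct path, no delay
hrir_right = np.zeros(32)
hrir_right[8] = 0.5         # delayed and attenuated, mimicking a source on the left

binaural = render_binaural(mono, hrir_left, hrir_right)
```

With real data, `mono` would be a VCTK utterance and the HRIR pair would be read from the database for the desired source direction, resampled to the speech sampling rate if the two differ.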
The listening files are organized by input SNR:
- The input SNRs are -6, -3, 0, 6, and 12 dB.
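For readers who want to reproduce mixtures at these input SNRs, the following is a minimal sketch of scaling a noise signal so that the mixture reaches a target SNR. It assumes the SNR is defined from the mean power over all samples and both channels, which may differ from the definition used to create the demo files; `mix_at_snr` is a hypothetical helper name.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power / noise power equals `snr_db`.

    Power is the mean square over all samples and channels (an assumption).
    Returns the noisy mixture and the gain applied to the noise.
    """
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + gain * noise, gain

# Toy two-channel signals standing in for binaural speech and isotropic noise.
rng = np.random.default_rng(1)
speech = rng.standard_normal((16000, 2))
noise = 3.0 * rng.standard_normal((16000, 2))

achieved = {}
for snr_db in (-6, -3, 0, 6, 12):
    noisy, gain = mix_at_snr(speech, noise, snr_db)
    scaled_noise = noisy - speech
    # Measure the SNR of the resulting mixture as a check.
    achieved[snr_db] = 10 * np.log10(np.mean(speech ** 2) / np.mean(scaled_noise ** 2))
```

Applying a single gain to both channels keeps the interaural level differences of the noise unchanged, which matters when the spatial cues of the scene should be preserved.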
Isotropic Noise at -6 dB SNR
Anechoic speech with Isotropic Noise at -6 dB.
Anechoic speech with Isotropic Noise at -6 dB.
Anechoic speech with Isotropic Noise at -6 dB.
Isotropic Noise at -3 dB SNR
Anechoic speech with Isotropic Noise at -3 dB.
Anechoic speech with Isotropic Noise at -3 dB.
Anechoic speech with Isotropic Noise at -3 dB.
Isotropic Noise at 0 dB SNR
Anechoic speech with Isotropic Noise at 0 dB.
Anechoic speech with Isotropic Noise at 0 dB.
Anechoic speech with Isotropic Noise at 0 dB.
Isotropic Noise at 6 dB SNR
Anechoic speech with Isotropic Noise at 6 dB.
Anechoic speech with Isotropic Noise at 6 dB.
Anechoic speech with Isotropic Noise at 6 dB.
Isotropic Noise at 12 dB SNR
Anechoic speech with Isotropic Noise at 12 dB.
Anechoic speech with Isotropic Noise at 12 dB.
Anechoic speech with Isotropic Noise at 12 dB.
References
[1] N. Werner, et al., “trackswitch.js: A Versatile Web-Based Audio Player for Presenting Scientific Results,” in 3rd Web Audio Conference, London, UK, 2017.
[2] H. Kayser, S. D. Ewert, J. Anemüller, T. Rohdenburg, V. Hohmann, and B. Kollmeier, “Database of multichannel in-ear and behind-the-ear head-related and binaural room impulse responses,” EURASIP J. on Advances in Signal Process., vol. 2009, no. 1, p. 298605, Jul. 2009.
[3] J. Yamagishi, C. Veaux, and K. MacDonald, “CSTR VCTK Corpus: English multispeaker corpus for CSTR voice cloning toolkit (version 0.92),” University of Edinburgh, The Centre for Speech Technology Research (CSTR), 2019.
[4] C. Han, Y. Luo, and N. Mesgarani, “Real-Time Binaural Speech Separation with Preserved Spatial Cues,” in Proc. IEEE Int. Conf. on Acoust., Speech and Signal Process. (ICASSP), May 2020, pp. 6404–6408.
[5] V. Tokala, M. Brookes, and P. A. Naylor, “Binaural Speech Enhancement Using STOI-optimal Masks,” in 2022 International Workshop on Acoustic Signal Enhancement (IWAENC), Sep. 2022, pp. 1–5.