Multichannel Binaural Speech Enhancement using Deep Complex Convolutional Recurrent Networks
The demos consist of binaural signals; headphones are recommended for listening.
- The binaural signals are generated using measured HRIRs from the Kayser database [2]
- The speech signals are taken from the VCTK Corpus [3]
- The code for the implementation can be accessed below
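As a rough illustration of how a binaural signal can be generated from measured HRIRs, the sketch below convolves a mono speech signal with a left and a right impulse response. It is a minimal example under stated assumptions, not the code used for these demos: the function name `render_binaural` and its arguments are hypothetical, and the HRIRs are assumed to be already loaded as 1-D arrays at the same sampling rate as the speech.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Convolve a mono signal with a left/right HRIR pair (hypothetical helper).

    Returns an array of shape (2, len(mono) + len(hrir_left) - 1),
    with the left channel in row 0 and the right channel in row 1.
    """
    mono = np.asarray(mono, dtype=float)
    hrir_left = np.asarray(hrir_left, dtype=float)
    hrir_right = np.asarray(hrir_right, dtype=float)
    if hrir_left.shape != hrir_right.shape:
        raise ValueError("left and right HRIRs must have the same length")
    # Full linear convolution of the source with each ear's impulse response.
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right])
```

Stacking the two channels in one array keeps the interaural level and time differences encoded by the HRIR pair together, which is what makes headphone listening meaningful.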
This page was generated using trackswitch.js [1].
The listening files are organized as follows:
- The input SNRs are -6, -3, 0, 6, and 12 dB
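For readers who want to reproduce mixtures at these input SNRs, the following sketch scales a noise signal so that the speech-to-noise power ratio matches a target value in dB. It is an assumption-laden example rather than the procedure used for the demos: the helper name `mix_at_snr` is hypothetical, and the SNR is assumed to be defined from the mean power over all samples of both channels, which may differ from the definition used in the paper.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech/noise power equals `snr_db`, then mix.

    `speech` and `noise` are arrays of identical shape, e.g. (2, num_samples)
    for binaural signals. Power is averaged over all channels and samples
    (an assumption; other SNR definitions exist).
    Returns the mixture and the gain applied to the noise.
    """
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)
    if speech.shape != noise.shape:
        raise ValueError("speech and noise must have the same shape")
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Noise power required to reach the target SNR.
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    gain = np.sqrt(target_noise_power / noise_power)
    return speech + gain * noise, gain
```

Looping over `(-6, -3, 0, 6, 12)` with the same speech and noise arrays would give one mixture per condition listed above.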
Isotropic Noise at -6 dB SNR
Anechoic speech with Isotropic Noise at -6 dB.
Anechoic speech with Isotropic Noise at -6 dB.
Isotropic Noise at -3 dB SNR
Anechoic speech with Isotropic Noise at -3 dB.
Anechoic speech with Isotropic Noise at -3 dB.
Isotropic Noise at 0 dB SNR
Anechoic speech with Isotropic Noise at 0 dB.
Anechoic speech with Isotropic Noise at 0 dB.
Isotropic Noise at 6 dB SNR
Anechoic speech with Isotropic Noise at 6 dB.
Anechoic speech with Isotropic Noise at 6 dB.
Isotropic Noise at 12 dB SNR
Anechoic speech with Isotropic Noise at 12 dB.
Anechoic speech with Isotropic Noise at 12 dB. (Files to be updated)
References
[1] N. Werner et al., “trackswitch.js: A Versatile Web-Based Audio Player for Presenting Scientific Results,” in 3rd Web Audio Conference, London, UK, 2017.
[2] H. Kayser, S. D. Ewert, J. Anemüller, T. Rohdenburg, V. Hohmann, and B. Kollmeier, “Database of multichannel in-ear and behind-the-ear head-related and binaural room impulse responses,” EURASIP J. on Advances in Signal Process., vol. 2009, no. 1, p. 298605, Jul. 2009.
[3] J. Yamagishi, C. Veaux, and K. MacDonald, “CSTR VCTK Corpus: English multispeaker corpus for CSTR voice cloning toolkit (version 0.92),” University of Edinburgh, The Centre for Speech Technology Research (CSTR), 2019.
[4] C. Han, Y. Luo, and N. Mesgarani, “Real-Time Binaural Speech Separation with Preserved Spatial Cues,” in Proc. IEEE Int. Conf. on Acoust., Speech and Signal Process. (ICASSP), May 2020, pp. 6404–6408.
[5] V. Tokala, M. Brookes, and P. A. Naylor, “Binaural Speech Enhancement Using STOI-optimal Masks,” in 2022 International Workshop on Acoustic Signal Enhancement (IWAENC), Sep. 2022, pp. 1–5.