Hybrid DNN-based single-microphone speech enhancement

E. Chazan, J. Goldberger, and S. Gannot, “A hybrid approach for speech enhancement using MoG model and neural network phoneme classifier,” IEEE Transactions on Audio, Speech, and Language Processing, submitted for publication, revised Aug. 2016 (NN-MM).

Signal Source: TIMIT, Noise Source: Babble, SNR: 5 dB

Clean 0:08 Noisy 0:08 MiXMaX 0:08 OMLSA 0:08 NN-MM 0:08

Signal Source: TIMIT, Noise Source: Room, SNR: 5 dB

Clean 0:05 Noisy 0:05 MiXMaX 0:05 OMLSA 0:05 NN-MM 0:05

Signal Source: WSJ, Noise Source: Factory, SNR: 5 dB

Clean 0:08 Noisy 0:08 MiXMaX 0:08 OMLSA 0:08 NN-MM 0:08

Signal Source: WSJ, Noise Source: Siren, SNR: 5 dB

Clean 0:08 Noisy 0:08 MiXMaX 0:08 OMLSA 0:08 NN-MM 0:08

We compare the proposed algorithm (NN-MM) with two baseline algorithms:

MiXMaX:

D. Burshtein and S. Gannot, “Speech enhancement using a mixture-maximum model,” IEEE Transactions on Speech and Audio Processing, vol. 10, no. 6, pp. 341–351, Sep. 2002.

OMLSA:

I. Cohen and B. Berdugo, “Speech enhancement for non-stationary noise environments,” Signal Processing, vol. 81, no. 11, pp. 2403–2418, 2001.