A New Frequency Domain Method for Blind Source Separation of Convolutive Audio Mixtures

To appear in IEEE Transactions on Speech and Audio Processing
Kamran Rahbar and James P. Reilly

Abstract: In this paper we propose a new frequency domain approach to blind source separation (BSS) of audio signals mixed in a reverberant environment. It is first shown that joint diagonalization of the cross power spectral density matrices of the signals at the output of the mixing system is sufficient to identify the mixing system at each frequency bin up to a scale and permutation ambiguity. The frequency domain joint diagonalization is performed using a new and quickly converging algorithm based on alternating least-squares (ALS) optimization. The inverse of the mixing system, estimated using the joint diagonalization algorithm, is then used to separate the sources. An efficient dyadic algorithm is presented that resolves the frequency-dependent permutation ambiguities by exploiting the inherent non-stationarity of the sources. The unknown scaling ambiguities are partially resolved using a novel initialization procedure for the ALS algorithm.
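The identifiability claim above can be illustrated at a single frequency bin with a minimal sketch. It is not the ALS algorithm of the paper: it assumes a square, invertible 2x2 mixing matrix, noise-free cross power spectral density matrices from only two epochs, and it swaps in a plain eigendecomposition in place of joint diagonalization over many epochs. The matrix H, the source powers lam1 and lam2, and all variable names are hypothetical values chosen for illustration.

```python
import numpy as np

n_src = 2

# Hypothetical mixing matrix at one frequency bin (complex, invertible).
H = np.array([[1.0 + 0.5j, 0.3 - 0.2j],
              [0.4 + 0.1j, 1.0 - 0.3j]])

# Hypothetical source powers at this bin in two epochs. Non-stationarity
# means the per-source power ratios lam1/lam2 differ between sources.
lam1 = np.array([1.0, 3.0])
lam2 = np.array([2.0, 0.5])

# Cross power spectral density matrices of the mixtures in each epoch:
# P_m = H diag(lam_m) H^H (uncorrelated sources, no noise).
P1 = H @ np.diag(lam1) @ H.conj().T
P2 = H @ np.diag(lam2) @ H.conj().T

# P1 P2^{-1} = H diag(lam1 / lam2) H^{-1}, so its eigenvectors are the
# columns of H, each up to an unknown complex scale and in unknown order.
eigvals, H_est = np.linalg.eig(P1 @ np.linalg.inv(P2))

# Separating matrix for this bin and the resulting global system.
W = np.linalg.inv(H_est)
G = W @ H  # a scaled permutation matrix if identification succeeded

print(np.round(np.abs(G), 6))
```

Each row and column of G has a single non-negligible entry, which is the scale and permutation ambiguity named in the abstract: the sources are recovered, but their order and gain at this bin are arbitrary. With more than two epochs and with estimation noise, no single eigendecomposition diagonalizes all matrices exactly, which is where an approximate joint diagonalization method such as the ALS approach comes in.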

The performance of the proposed algorithm is demonstrated by experiments conducted in real reverberant rooms. The algorithm demonstrates good separation performance and enhanced output audio quality. The proposed algorithm is compared to the recent work of Parra. Audio results are available at "www.ece.mcmaster.ca/~reilly/kamran/index.htm".


 
