Interspeech 2008

Ron had a paper accepted to Interspeech this year about adding speech models (source priors) to MESSL. It is entitled, “Source separation based on binaural cues and source model constraints.” As much as I’d like to go, Brisbane, Australia is a bit farther than Pittsburgh was. Here’s the abstract:

We describe a system for separating multiple sources from a two-channel recording based on interaural cues and known characteristics of the source signals. We combine a probabilistic model of the observed interaural level and phase differences with a prior model of the source statistics and derive an EM algorithm for finding the maximum likelihood parameters of the joint model. The system is able to separate more sound sources than there are observed channels. In simulated reverberant mixtures of three speakers the proposed algorithm gives a signal-to-noise ratio improvement of 2.1 dB over a baseline algorithm using only interaural cues.
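To make "interaural level and phase differences" a little more concrete, here is a minimal sketch of how those two cues could be computed for a single frame of a two-channel recording. It is not the paper's implementation: the function names `dft` and `interaural_cues` are made up for illustration, the naive DFT stands in for a proper short-time Fourier transform, and the probabilistic model and EM fitting described in the abstract are left out entirely.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) discrete Fourier transform of a real frame.

    Returns the non-negative frequency bins 0 .. N/2.
    """
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2 + 1)
    ]


def interaural_cues(left, right, eps=1e-12):
    """Per-bin (ILD in dB, IPD in radians) for one frame of each channel.

    Hypothetical helper for illustration only. eps guards against
    division by zero and log of zero in silent bins.
    """
    cues = []
    for l, r in zip(dft(left), dft(right)):
        # Level difference: ratio of magnitudes, expressed in decibels.
        ild = 20.0 * math.log10((abs(l) + eps) / (abs(r) + eps))
        # Phase difference: angle of left times conjugate of right,
        # which is automatically wrapped to (-pi, pi].
        ipd = cmath.phase(l * r.conjugate())
        cues.append((ild, ipd))
    return cues


# Toy input: a sinusoid that reaches the right channel two samples later
# and at half the amplitude, as if the source were off to the left.
N, K, DELAY = 64, 4, 2
left = [math.sin(2 * math.pi * K * t / N) for t in range(N)]
right = [0.5 * math.sin(2 * math.pi * K * (t - DELAY) / N) for t in range(N)]

ild_db, ipd_rad = interaural_cues(left, right)[K]
print(f"bin {K}: ILD = {ild_db:.2f} dB, IPD = {ipd_rad:.3f} rad")
```

For this toy input the cues at bin 4 should come out near 6 dB (a factor of two in amplitude) and π/4 radians (a two-sample delay at that frequency). A real system would compute these over many overlapping frames and then model them, rather than reading them off a single bin.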

2 Responses to “Interspeech 2008”

  1. greggT Says:

    A big issue in HD over-the-air transmissions is the signal bouncing off buildings (which delays it) and then arriving at an antenna multiple times. So could you use this separation technique to identify the main signal and filter out the reverberations, leaving just a pure signal?

  2. mim Says:

    I agree that the analogy is striking. In fact, the wavelengths of sound that humans can hear are between roughly 20 meters and 2 cm, and electromagnetic radiation at those same wavelengths has frequencies of roughly 20 MHz to 20 GHz, right in the radio/microwave regions. Of course, sound covers that whole bandwidth at once, while radio transmissions are quite narrowband.

    There’s a lot of research going on at the moment on multiple input, multiple output communication systems, like 802.11n, that are designed to handle such situations. I haven’t looked into it that closely, but it seems like that research mostly focuses on coding the transmission or beamforming at the transmitter or receiver, which are less interesting to me. I think it would be cool to make a radio that could hear radio waves like humans hear sound waves, removing echoes, separating overlapping sources, etc. Such a technology might substantially increase data rates.
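The wavelength comparison in mim's reply can be checked with a quick back-of-the-envelope calculation. This sketch assumes a speed of sound of 343 m/s (air at about 20 °C) and an audible range of 20 Hz to 20 kHz; neither figure is stated in the comment, and the function names are made up for illustration.

```python
SPEED_OF_SOUND = 343.0          # m/s, assumed: dry air at about 20 degrees C
SPEED_OF_LIGHT = 299_792_458.0  # m/s, exact by definition of the metre


def wavelength(speed, frequency_hz):
    """Wavelength in metres of a wave with the given speed and frequency."""
    return speed / frequency_hz


def frequency(speed, wavelength_m):
    """Frequency in hertz of a wave with the given speed and wavelength."""
    return speed / wavelength_m


# Assumed limits of human hearing.
for audio_hz in (20.0, 20_000.0):
    lam = wavelength(SPEED_OF_SOUND, audio_hz)
    em_hz = frequency(SPEED_OF_LIGHT, lam)
    print(f"{audio_hz:>8.0f} Hz sound -> {lam:.4f} m -> {em_hz / 1e6:.1f} MHz radio")
```

Under these assumptions the audible band spans wavelengths from about 17 m down to under 2 cm, and radio waves of the same wavelengths sit in the tens of MHz to tens of GHz, consistent with the rounded figures in the comment.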
