ClearBuds grew out of research at the University of Washington

We’ve had active noise canceling in wireless earbuds for a while, but that mostly helps the person wearing the headphones drown out the outside world. If you’ve been on the other end of a phone call with someone wearing ’em, you’ll have noticed that the microphones still pick up plenty besides the voice you’re trying to pay attention to. That’s what the open source ClearBuds project is trying to resolve, by adding a layer of deep learning and audio processing to the mix.

I can ramble on for a few thousand words here (and I still might), but if a picture is worth 1,000 words, then a 23-second 30 FPS video is worth almost 700,000 words, and I just can’t compete with that. Check it out:

The ClearBuds project came out of a research initiative by three University of Washington researchers who were roommates during the pandemic. It combines a microphone system with real-time machine-learning algorithms that can run on a smartphone.

Most earbuds send audio from only one of the buds to the phone. The ClearBuds system sends two streams, which can be analyzed and processed quickly enough to be used for live audio, such as video or phone calls. The team’s algorithm suppresses any non-voice sounds, then enhances the speaker’s voice.

“ClearBuds differentiate themselves from other wireless earbuds in two key ways,” said co-lead author Maruchi Kim, a doctoral student in the Paul G. Allen School of Computer Science & Engineering. “First, ClearBuds use a dual microphone array. Microphones in each earbud create two synchronized audio streams that provide information and allow us to spatially separate sounds coming from different directions with higher resolution. Second, the lightweight neural network further enhances the speaker’s voice.”

“Because the speaker’s voice is close by and approximately equidistant from the two earbuds, the neural network can be trained to focus on just their speech and eliminate background sounds, including other voices,” said co-lead author Ishan Chatterjee. “This method is quite similar to how your own ears work. They use the time difference between sounds coming to your left and right ears to determine from which direction a sound came from.”
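To make the time-difference idea concrete, here is a minimal Python sketch. It is not the ClearBuds neural network and is not taken from the project’s code. It is a classic cross-correlation toy, and every name in it (`best_lag`, `delayed`, the random-noise stand-ins for a voice and a barking dog, the five-sample delay) is an assumption made up for illustration. The point is only that a sound equidistant from both microphones shows up with zero delay between the two streams, while an off-axis sound shows up shifted.

```python
import random


def best_lag(left, right, max_lag):
    """Return the lag (in samples) at which `right` best matches `left`.

    A positive result means the sound reached the right channel later
    than the left one; zero means it arrived at both simultaneously.
    """
    def score(lag):
        # Correlate left[i] with right[i + lag] over the overlapping region.
        total = 0.0
        for i, sample in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                total += sample * right[j]
        return total

    return max(range(-max_lag, max_lag + 1), key=score)


def delayed(signal, delay):
    """Shift `signal` later by `delay` samples, padding the start with silence."""
    return [0.0] * delay + signal[: len(signal) - delay]


random.seed(0)  # deterministic noise so the illustration is repeatable
n = 400

# Hypothetical stand-ins: broadband noise for the wearer's voice and a distant dog.
voice = [random.uniform(-1.0, 1.0) for _ in range(n)]
dog = [random.uniform(-1.0, 1.0) for _ in range(n)]

# The wearer's mouth is equidistant from both earbuds: no inter-channel delay.
voice_lag = best_lag(voice, voice, max_lag=10)

# An off-axis source reaches the right earbud 5 samples after the left (assumed).
dog_lag = best_lag(dog, delayed(dog, 5), max_lag=10)

print(voice_lag, dog_lag)
```

In this toy setup the voice comes back with a lag of 0 and the dog with a lag of 5. A real system has to cope with both sounds mixed together, room echoes and wireless sync, which is where the synchronized streams and the neural network described above come in.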

Check out the full project page, and cross your fingers that this tech finds its way into some headphones soon, because, frankly, I can’t wait to not hear barking dogs, zooming cars and my niece singing “We Don’t Talk About Bruno-no-no” in the background. Okay, let’s be honest, I’ll miss the singing. Everything else can go, though.