AI Headphones Create Cones of Silence

by
Sidney Perkowitz
from IEEE Spectrum
[Image: Four people split into groups of two, each pair with a different colored bubble drawn around them, talking in the same room.]

It's an experience we've all had: Whether catching up with a friend over dinner at a restaurant, meeting an interesting person at a cocktail party, or conducting a meeting amid office commotion, we find ourselves having to shout over background chatter and general noise. The human ear and brain are not especially good at identifying separate sources of sound in a noisy environment to focus on a particular conversation. This ability deteriorates further with general hearing loss, which is becoming more prevalent as people live longer, and can lead to social isolation.

However, a team of researchers from the University of Washington, Microsoft, and AssemblyAI has just shown that AI can outdo humans in isolating sound sources to create a zone of silence. This sound bubble allows people within a radius of up to 2 meters to converse with hugely reduced interference from other speakers or noise outside the zone.

The group, led by University of Washington professor Shyam Gollakota, aims to combine AI with hardware to augment human capabilities. This is different, Gollakota says, from working with enormous computational resources such as those ChatGPT employs; rather, the challenge is to create useful AI applications within the limits of hardware constraints, particularly for mobile or wearable use. Gollakota has long thought that what has been called the "cocktail party problem" is a widespread issue where this approach could be feasible and beneficial.

Currently, commercially available noise-canceling headsets suppress background noise but do not compensate for distances to the sound sources or for other issues such as reverberation in enclosed spaces. Previous studies, however, have shown that neural networks achieve better separation of sound sources than conventional signal processing. Building on this finding, Gollakota's group designed an integrated hardware-AI "hearable" system that analyzes audio data to clearly identify sound sources inside and outside a designated bubble. The system then suppresses extraneous sounds in real time, so there is no perceptible lag between what users hear and what they see while watching the person speaking.
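The paper describes a neural network that performs the separation and distance estimation jointly; purely as an illustration of the final step, the sketch below (plain NumPy, all names hypothetical) shows how separated per-source waveforms and distance estimates could be remixed so that only in-bubble talkers remain audible.

```python
import numpy as np

def remix_sound_bubble(sources, distances_m, bubble_radius_m=1.5):
    """Toy remixing step: keep talkers estimated to be inside the bubble.

    sources:     list of 1-D NumPy arrays, one separated waveform per talker
                 (assumed to come from an upstream source-separation network)
    distances_m: list of per-source distance estimates, in meters
    """
    mix = np.zeros_like(sources[0])
    for waveform, dist in zip(sources, distances_m):
        if dist <= bubble_radius_m:  # inside the bubble: pass through
            mix += waveform
        # outside the bubble: omitted entirely in this toy version;
        # the real system strongly attenuates rather than hard-gates
    return mix
```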

The audio part of the system is a commercial noise-canceling headset with up to six microphones that detect nearby and more distant sounds, providing data for neural-network analysis. Custom-built networks find the distances to sound sources and determine which of them lie inside a programmable bubble radius of 1 meter, 1.5 meters, or 2 meters. These networks were trained with both simulated and real-world data, taken in 22 rooms of varied sizes and sound-absorbing qualities with different combinations of human subjects. The algorithm runs on a small embedded CPU, either the Orange Pi or Raspberry Pi, and sends processed data back to the headphones in milliseconds, fast enough to keep hearing and vision in sync.
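The team's distance estimator is a trained network, but the raw cue it can exploit is familiar from classical array processing: the same sound reaches the headset's microphones at slightly different times. As a conventional-DSP illustration only, not the team's method, the GCC-PHAT technique estimates that inter-microphone delay:

```python
import numpy as np

def gcc_phat_delay(x, y, fs):
    """Classical GCC-PHAT time-delay estimate between two microphone
    signals x and y (1-D NumPy arrays) sampled at fs Hz. Returns the
    estimated relative delay in seconds. Shown only as the kind of
    timing cue a learned distance estimator can draw on."""
    n = len(x) + len(y)
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    R = X * np.conj(Y)
    R /= np.abs(R) + 1e-12     # PHAT weighting: keep phase, drop magnitude
    cc = np.fft.irfft(R, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs
```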

Hear the difference between a conversation with the noise-canceling headset turned on and off. Malek Itani and Tuochao Chen/Paul G. Allen School/University of Washington

The algorithm in this prototype reduced the sound volume outside the empty bubble by 49 decibels, to approximately 0.001 percent of the intensity recorded inside the bubble. Even in new acoustic environments and with different users, the system functioned well for up to two speakers in the bubble and one or two interfering outside speakers, even if they were louder. It also accommodated the arrival of a new speaker inside the bubble.
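That percentage follows directly from the decibel figure: intensity scales as 10^(-dB/10), so a 49-decibel reduction leaves about 1.3 × 10⁻⁵ of the original intensity, or roughly 0.001 percent:

```python
attenuation_db = 49
intensity_ratio = 10 ** (-attenuation_db / 10)
print(f"{intensity_ratio:.2e}  ({intensity_ratio * 100:.4f} percent)")
# prints: 1.26e-05  (0.0013 percent)  -- about 0.001 percent, as reported
```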

It's easy to imagine applications of the system in customizable noise-canceling devices, especially where clear and effortless verbal communication is needed in a noisy environment. The dangers of social isolation are well known, and a technology specifically designed to enhance person-to-person communication could help. Gollakota believes there's value in simply helping a person focus their auditory and spatial attention for personal interaction.

Sound-bubble technology could also eventually be integrated into hearing aids. Both Google and Swiss hearing-aid manufacturer Phonak have added AI elements to their earbuds and hearing aids, respectively. Gollakota is now considering how to put the sound-bubble approach into a comfortably wearable hearing-aid format. For that to happen, the device would have to fit into earbuds or a behind-each-ear configuration, wirelessly communicate between the left and right units, and operate all day on tiny batteries.

Gollakota is confident that this can be done. "We are at a time when hardware and algorithms are coming together to support AI augmentation," he says. "This is not about AI replacing jobs, but about having a positive impact on people through a human-computer interface."
