AI Headphones Let Wearer Listen to a Single Person in a Crowd, by Looking at Them Just Once

by janrinok, from SoylentNews (#6N7A6)

DannyB writes:

AI headphones let wearer listen to a single person in a crowd, by looking at them just once

Noise-canceling headphones have gotten very good at creating an auditory blank slate. But allowing certain sounds from a wearer's environment through the erasure still challenges researchers. The latest edition of Apple's AirPods Pro, for instance, automatically adjusts sound levels for wearers - sensing when they're in conversation, for instance - but the user has little control over whom to listen to or when this happens.

A University of Washington team has developed an artificial intelligence system that lets a user wearing headphones look at a person speaking for three to five seconds to "enroll" them. The system, called "Target Speech Hearing," then cancels all other sounds in the environment and plays just the enrolled speaker's voice in real time even as the listener moves around in noisy places and no longer faces the speaker.

[...] To use the system, a person wearing off-the-shelf headphones fitted with microphones taps a button while directing their head at someone talking. The sound waves from that speaker's voice then should reach the microphones on both sides of the headset simultaneously; there's a 16-degree margin of error. The headphones send that signal to an on-board embedded computer, where the team's machine learning software learns the desired speaker's vocal patterns. The system latches onto that speaker's voice and continues to play it back to the listener, even as the pair moves around. The system's ability to focus on the enrolled voice improves as the speaker keeps talking.
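To put the "simultaneously, within a 16-degree margin" condition in numbers, here is a rough back-of-the-envelope sketch. It is not the UW team's method. It assumes a simple far-field model, a hypothetical 0.18 m spacing between the two microphones, and that the margin means up to 16 degrees to either side of straight ahead. The function names are made up for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate value for air at room temperature
MIC_SPACING_M = 0.18        # ASSUMPTION: ear-to-ear distance, not from the article
MARGIN_DEG = 16.0           # margin of error quoted in the article (assumed +/-)


def arrival_delay_s(angle_deg, spacing_m=MIC_SPACING_M):
    """Far-field difference in arrival time between the two microphones
    for a source angle_deg away from straight ahead."""
    return spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND_M_S


def within_enrollment_cone(delay_s, margin_deg=MARGIN_DEG, spacing_m=MIC_SPACING_M):
    """True if a measured left/right delay is consistent with a source
    inside the assumed margin around the wearer's facing direction."""
    return abs(delay_s) <= arrival_delay_s(margin_deg, spacing_m)


max_delay = arrival_delay_s(MARGIN_DEG)
print(f"Largest delay inside the margin: {max_delay * 1e6:.0f} microseconds")
```

Under these assumptions the largest allowed left/right delay comes out a little under 0.15 milliseconds, which gives a sense of how small a timing difference the headset has to resolve when the wearer "looks at" the speaker.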

[...] Currently the TSH system can enroll [sic] only one speaker at a time, and it's only able to enroll [sic] a speaker when there is not another loud voice coming from the same direction as the target speaker's voice. If a user isn't happy with the sound quality, they can run another enrollment on the speaker to improve the clarity.

It would seem the embedded single board computer would get very good at only allowing the voice of a speaker who talks too much.

See YouTube video: here.
