To better understand speech, focus on who's talking: Study
Oct 30, 2021
Washington [US], October 30: A recent study led by an international team of researchers suggested that matching the locations of faces with the speech sounds they are producing significantly improves listeners' ability to understand that speech, especially in noisy settings where other talkers are present.
In the Journal of the Acoustical Society of America, published by the Acoustical Society of America through AIP Publishing, researchers from Harvard University, University of Minnesota, University of Rochester, and Carnegie Mellon University outline a set of online experiments that mimicked aspects of distracting scenes to learn more about how we focus on one audio-visual talker and ignore others.
"If there's only one multisensory object in a scene, our group and others have shown that the brain is perfectly willing to combine sounds and visual signals that come from different locations in space," said author Justin Fleming. "It's when there's a multisensory competition that spatial cues take on more importance."
The researchers first asked participants to pay attention to one talker's speech and ignore another talker, with corresponding faces and voices originating either from the same location or from different locations. Participants performed significantly better when the face matched where the voice was coming from.
Next, they found that task performance decreased when participants directed their gaze toward the distracting voice.
Finally, the researchers showed spatial alignment between faces and voices was more important when the background noise was louder, suggesting the brain makes more use of audio-visual spatial cues in challenging sensory environments.
The pandemic forced the group to get creative about conducting such research with participants over the internet.
"We had to learn about and, in some cases, create several tasks to make sure participants were seeing and hearing the stimuli properly, wearing headphones, and following instructions," Fleming said.
Fleming hopes the findings will lead to improved designs for hearing devices and better handling of sound in virtual and augmented reality. The researchers look to expand on their work by bringing additional real-world elements into the fold.
"Historically, we have learned a great deal about our sensory systems from studies involving simple flashes and beeps," he said. "However, this and other studies are now showing that when we make our tasks more complicated in ways that better simulate the real world, new patterns of results start to emerge."