P46
Session 2 (Friday 12 January 2024, 09:00-11:30)
The imperfect invariance problem modifies cortical signals during listening in noise
Imperfect invariances in speech pose a challenge to speech perception. When listening to speech in noise, finding regularities is hindered not only by the noise but also by the signal itself, which varies due to coarticulation, underarticulation, and individual speaker characteristics. What is the minimal amount of invariance needed to perceive a coherent auditory object? To investigate the imperfect invariance problem in noise, we employed the stochastic figure-ground (SFG) task, which has been established as a suitable model for speech-in-noise listening. The SFG task uses random noise created from 50 ms long pure tones (“ground”) and an embedded set of tones that fluctuate coherently over time (“figure”). We modified the number of tones available at different time points in the figure. We recorded the electroencephalogram (EEG) and analyzed event-related brain potentials (ERPs) while listeners tried to detect the figure segments.
Twenty-two healthy young adults listened to SFG stimuli of 3 s duration each. Half of the trials contained a figure, and half contained a random background only. Participants indicated whether there was a figure in each trial. All stimuli consisted of pure tones of 20 discrete frequencies, selected from a larger set. Figures contained a set of 10 potential repeating frequencies. Across three conditions, we varied how many of these frequencies were concurrently present in the stimulus: 10/10, 7/10, or 4/10. Within a condition, the number of repeating frequencies present remained constant throughout the figure, but the actual frequencies present (selected from the set of 10) changed every 50 ms. This models the imperfect invariance problem: out of a set of possible components, only a subset is present in the stimulus at each time point, and that subset varies throughout the stimulus.
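The stimulus design described above can be illustrated with a minimal sketch. It builds only the list of tone frequencies for each 50 ms chord and does not synthesize audio. It is not the authors' stimulus code. The 50 ms chord duration, 3 s stimulus length, 20 tones per chord, and set of 10 figure frequencies come from the abstract. The frequency pool (`make_pool`), the figure onset and length (`figure_start`, `figure_len`), keeping the total at 20 tones per chord during the figure, and excluding figure frequencies from the ground are all assumptions made for illustration.

```python
import random

CHORD_MS = 50          # chord duration, from the abstract
STIM_MS = 3000         # stimulus duration, from the abstract
TONES_PER_CHORD = 20   # tones per chord, from the abstract
FIGURE_SET_SIZE = 10   # potential figure frequencies, from the abstract


def make_pool(n=120, f_lo=200.0, f_hi=7000.0):
    """Hypothetical log-spaced frequency pool; size and range are assumed."""
    ratio = (f_hi / f_lo) ** (1.0 / (n - 1))
    return [f_lo * ratio ** i for i in range(n)]


def make_sfg_chords(k_present, figure_start=20, figure_len=20, seed=0):
    """Return (chords, figure_set) for one trial.

    k_present: figure frequencies present per chord (10, 7, or 4);
               0 yields a no-figure trial.
    figure_start, figure_len: hypothetical figure position, in chords.
    """
    rng = random.Random(seed)
    pool = make_pool()
    n_chords = STIM_MS // CHORD_MS

    # Fixed set of 10 candidate figure frequencies for this trial.
    figure_set = rng.sample(pool, FIGURE_SET_SIZE)
    # Assumption: ground tones never coincide with figure frequencies.
    ground_pool = [f for f in pool if f not in figure_set]

    chords = []
    for i in range(n_chords):
        in_figure = k_present > 0 and figure_start <= i < figure_start + figure_len
        # A new subset of the figure set is drawn for every chord.
        figure = rng.sample(figure_set, k_present) if in_figure else []
        # Assumption: ground tones fill the chord up to 20 tones in total.
        ground = rng.sample(ground_pool, TONES_PER_CHORD - len(figure))
        chords.append({"figure": sorted(figure), "ground": sorted(ground)})
    return chords, sorted(figure_set)
```

Calling `make_sfg_chords(7)` gives a 7/10 trial: each figure chord carries 7 of the 10 figure frequencies, and the 7 are redrawn independently for every chord.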
We found that the number of repeating frequencies affects the object-related negativity (ORN) and P400 ERP responses to figure segments. Both components were largest for figures with 10/10 coherent frequencies, smaller for figures with 7/10, and smallest for figures with 4/10. Further, ORN and P400 amplitudes did not differ between figures with 4/10 coherent frequencies and no-figure trials. These results suggest that 40% invariance in frequency components is insufficient for detecting an auditory object in noise (fluctuating energetic masking), whereas 70% invariance is sufficient. We discuss these conclusions and their significance for speech-in-noise listening in the poster presentation.