15th Speech in Noise Workshop, 11-12 January 2024, Potsdam, Germany

P28 - Session 2 (Friday 12 January 2024, 09:00-11:30)
Neural processing of degraded speech under divided attention: An fMRI – machine learning study

Han Wang, Patti Adank
Department of Speech, Hearing and Phonetic Sciences, University College London

Understanding spoken language often occurs under suboptimal listening conditions, such as processing degraded speech input or conversing in the presence of a competing talker. Neuroimaging studies suggest a role for frontal regions in compensating for degraded speech (Erb et al., 2013, doi:10.1523/JNEUROSCI.4596-12.2013) and for the cingulo-opercular attentional network in allocating resources between tasks under distraction (Gennari et al., 2018, doi:10.1016/j.neuroimage.2018.06.035). However, no study to date has examined the combined effects of acoustic degradation and distraction on the speech processing network.

Using functional magnetic resonance imaging (fMRI) and machine learning (ML), we investigated the neural basis of processing degraded speech under divided attention. We examined brain responses of listeners (N=25) performing a sentence recognition task (4- and 8-band noise-vocoded speech) concurrently with a visuospatial task at two difficulty levels. Conventional general linear model (GLM) fMRI analysis revealed intelligibility-related responses in the frontal and cingulate cortices and bilateral insulae, but failed to detect neural correlates of visual-task difficulty. Using gradient tree boosting (Chen & Guestrin, 2016, doi:10.1145/2939672.2939785), we predicted task conditions from brain responses with 60% accuracy, significantly above the 25% chance level. Importantly, the algorithm further identified elevated responses in the ventral visual pathway (right inferior frontal gyrus) and right insula when the visual task imposed a high (compared to low) demand. Moreover, regions sensitive to speech intelligibility, including the left supplementary motor area and right middle frontal gyrus, showed attenuated responses under the hard visual task, suggesting dynamic allocation of resources across the two tasks.
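The decoding approach above can be illustrated with a minimal, hypothetical sketch: simulated trial-by-feature data stand in for real fMRI responses, and scikit-learn's GradientBoostingClassifier stands in for the XGBoost implementation used in the study. All dimensions, parameters, and the injected signal are illustrative assumptions, not the authors' actual pipeline.

```python
# Hypothetical sketch of multiclass decoding of task conditions from
# brain responses with gradient tree boosting (stand-in for XGBoost).
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_features = 200, 20            # assumed dimensions, for illustration

# Four task conditions (e.g. 2 vocoding bands x 2 visual-load levels).
y = rng.integers(0, 4, size=n_trials)

# Simulated voxel/ROI responses; inject a condition-related signal into
# one feature so the classifier has something to decode.
X = rng.normal(size=(n_trials, n_features))
X[:, 0] += y

clf = GradientBoostingClassifier(n_estimators=50, max_depth=3, random_state=0)
acc = cross_val_score(clf, X, y, cv=5).mean()

# With four balanced classes, chance level is 0.25; decoding accuracy
# above that indicates condition information in the simulated responses.
print(f"mean CV accuracy: {acc:.2f} (chance = 0.25)")
```

Cross-validated accuracy against the chance level (25% for four conditions) mirrors the comparison reported above; on real data, permutation testing would be used to establish significance.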

Our results reveal the engagement of a frontotemporal network in the processing of degraded speech under divided attention. They also suggest that ML can detect spatially complex, subtle, and non-linear neural activation patterns that conventional inferential statistics miss. The tree-based approach offers a robust and generalisable solution for prediction from fMRI data, where sample sizes are often limited by external constraints.

Last modified 2024-01-16 10:49:05