15th Speech in Noise Workshop, 11-12 January 2024, Potsdam, Germany

P50 Session 2 (Friday 12 January 2024, 09:00-11:30)
Factor analysis of acoustic signals for the determination of optimal boundaries: Perspectives concerning cochlear implants and investigation of measurement variation

Olivier Crouzet, Agnieszka Duniec
Nantes Université / CNRS, France

Previous studies applied Factor Analysis to the amplitude modulations of speech signals in order to estimate optimal frequency boundaries between channels. Such estimates may contribute to future improvements in the specification of filterbank decomposition in cochlear implants. While some authors argued that 4 channels would be sufficient to represent the main segmental information, comparisons of speech statistics with perceptual performance suggested that 6 to 7 frequency bands would be required to optimally represent vocoded speech. We applied the same approach to 2 different datasets: (a) free music recordings (Free Music Archive, https://github.com/mdeff/fma), (b) a free corpus of speech signals (Clarity Speech, doi:10.17866/rd.salford.16918180). An algorithm for the automatic computation of optimal boundary frequencies was also developed, as results from the literature were based on visual judgements only.

As expected, the observed boundaries differ between speech and music: their distributions are organized differently, and the differences are not homogeneous across boundaries. For example, when selecting 7 modulation channels, the differences vary in size and direction, from -3.8 to +10.7 semitones, with the same determination method. Similar variation is observed whether the maximal acoustic frequency is adapted for music or kept constant for both conditions. Further, the boundaries estimated from our speech data also differ from results in the literature. For example, determining the optimal boundary frequencies for 4 channels gives rise to differences ranging from +0.15 to +2.19 semitones. Such variation may relate to the database content, the determination method (visual vs. automatic), or both.
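
The semitone differences quoted above compare pairs of boundary frequencies on a logarithmic scale. As a minimal sketch, assuming the standard equal-tempered definition (12 semitones per octave), the signed interval between two frequencies could be computed as follows. The function name and the example frequencies are hypothetical illustrations, not values or code from the study.

```python
import math


def semitone_difference(f_ref_hz: float, f_hz: float) -> float:
    """Signed interval from f_ref_hz to f_hz, in equal-tempered semitones.

    Positive values mean f_hz lies above f_ref_hz, negative values below.
    Assumption: 12 semitones per octave (a frequency ratio of 2).
    """
    if f_ref_hz <= 0 or f_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f_hz / f_ref_hz)


# Hypothetical boundary frequencies, used only to illustrate the scale:
# a boundary one octave above the reference is 12 semitones higher.
print(semitone_difference(1000.0, 2000.0))  # 12.0
```

Under this definition the sign encodes the direction of the shift, which is why a range such as -3.8 to +10.7 semitones reflects differences in both size and direction.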

Given such variability, it seemed crucial to investigate the level of variation observed when the type of signal and the method are kept constant. Capitalizing on our automatic estimation of boundary frequencies, we applied a procedure aimed at estimating variability in the measurements: concentrating on the music database, random portions of fixed duration are extracted from each music recording and the same Factor Analysis is applied. Automatic boundary determination between optimal channels is then performed. For each random extract, a sample of frequency estimates is thus available, along with information characterizing musical and technical parameters (style, compression...). This approach and the final results will be presented. They may contribute to a better analysis of the importance of "efficient coding" for channel decomposition in cochlear implants by providing fine-grained data on variation in optimal frequency boundaries.
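
The random extraction step described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the extract duration, the number of extracts per recording, or the sampling scheme, so the uniform draw of start times, the possibility of overlapping extracts, and all names and values below are hypothetical.

```python
import random


def random_extract_bounds(total_s, extract_s, n_extracts, seed=None):
    """Return (start, end) times in seconds for random fixed-duration extracts.

    Assumptions (not from the abstract): start times are drawn uniformly
    over the recording, and extracts are allowed to overlap.
    """
    if extract_s <= 0 or extract_s > total_s:
        raise ValueError("extract duration must be in (0, total duration]")
    rng = random.Random(seed)  # seeded generator for reproducible draws
    latest_start = total_s - extract_s
    starts = [rng.uniform(0.0, latest_start) for _ in range(n_extracts)]
    return [(start, start + extract_s) for start in starts]


# Hypothetical example: four 5-second extracts from a 30-second recording.
for start, end in random_extract_bounds(30.0, 5.0, 4, seed=0):
    print(f"{start:.2f} s to {end:.2f} s")
```

Each pair of bounds would then delimit the portion of the recording passed to the Factor Analysis and boundary determination steps, yielding one set of boundary estimates per extract.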

Acknowledgements: Agnieszka Duniec received PhD funding (2019–2023) from the RFI-Ouest Industries Créatives (RFI-OIC, Région Pays de la Loire) & Nantes Université.

Last modified 2024-01-16 10:49:05