Personalization for robust voice pathology detection in sound waves
2023 (English). In: Proceedings of the annual conference of the international speech communication association, INTERSPEECH, International Speech Communication Association, 2023, p. 1708-1712. Conference paper, Published paper (Refereed)
Abstract [en]
Automatic voice pathology detection from sound signals is promising for noninvasive screening and early intervention. However, existing methods are susceptible to covariate shifts caused by background noise, variation in human voices, and data selection biases, which lead to severe performance degradation in real-world scenarios. We therefore propose a noninvasive framework that learns personalization from sound waves through contrastive pre-training and predicts profile features in the latent space through semi-supervised learning. It allows subjects from various distributions (e.g., regionality, gender, age) to benefit from personalized predictions for robust voice pathology detection in a privacy-preserving manner. We extensively evaluate the framework on four real-world respiratory illness datasets (Coswara, COUGHVID, ICBHI, and our private dataset, ASound) under multiple covariate shift settings (i.e., cross-dataset), improving overall performance by up to 4.12%.
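The abstract does not specify the contrastive objective. As a rough illustration only, and not the authors' implementation, the sketch below shows a generic InfoNCE-style loss in which two recordings assumed to come from the same subject form a positive pair and recordings from other subjects act as negatives. The function names, the temperature value, and the toy embeddings are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch of embedding pairs.

    anchors[i] and positives[i] are assumed to be embeddings of two
    recordings from the same subject; positives[j] with j != i serve
    as negatives for anchors[i].
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with max subtraction for numerical stability.
        top = max(logits)
        log_denom = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings (hypothetical values): two subjects, two clips each.
anchors = [[1.0, 0.0], [0.0, 1.0]]
matched = [[0.9, 0.1], [0.1, 0.9]]
swapped = [matched[1], matched[0]]  # pairs deliberately misaligned

loss_matched = info_nce(anchors, matched)
loss_swapped = info_nce(anchors, swapped)
```

In this toy setup the loss is lower when each anchor is paired with a clip from the same subject than when the pairs are misaligned, which is the property a contrastive pre-training stage relies on.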
Place, publisher, year, edition, pages
International Speech Communication Association, 2023. p. 1708-1712
Series
Interspeech, ISSN 2308-457X, E-ISSN 1990-9772
Keywords [en]
covariate shift, robust voice pathology detection
National Category
Natural Language Processing; Computer Sciences
Identifiers
URN: urn:nbn:se:umu:diva-214779
DOI: 10.21437/Interspeech.2023-1332
ISI: 001186650301172
Scopus ID: 2-s2.0-85171525230
OAI: oai:DiVA.org:umu-214779
DiVA, id: diva2:1806520
Conference
24th International Speech Communication Association, Interspeech 2023, Dublin, August 20-24, 2023
2023-10-23, 2023-10-23, 2025-04-24. Bibliographically approved