Personalization for robust voice pathology detection in sound waves
AI Center, FPT Software Company Limited, Viet Nam.
AI Center, FPT Software Company Limited, Viet Nam.
AI Center, FPT Software Company Limited, Viet Nam.
School of Computer Science and Information Technology, University College Cork, Ireland.
2023 (English). In: Proceedings of the annual conference of the international speech communication association, INTERSPEECH, International Speech Communication Association, 2023, p. 1708-1712
Conference paper, Published paper (Refereed)
Abstract [en]

Automatic voice pathology detection from sound signals is promising for noninvasive screening and early intervention. However, existing methods are susceptible to covariate shift caused by background noise, variation in human voices, and data selection bias, which leads to severe performance degradation in real-world scenarios. We therefore propose a noninvasive framework that learns personalization from sound waves through contrastive pre-training and predicts latent-space profile features through semi-supervised learning. It allows subjects from different distributions (e.g., region, gender, age) to benefit from personalized predictions for robust voice pathology detection in a privacy-preserving manner. We extensively evaluate the framework on four real-world respiratory illness datasets (Coswara, COUGHVID, ICBHI, and our private dataset, ASound) under multiple covariate shift settings (i.e., cross-dataset), improving overall performance by up to 4.12%.

Place, publisher, year, edition, pages
International Speech Communication Association, 2023. p. 1708-1712
Series
Interspeech, ISSN 2308-457X, E-ISSN 1990-9772
Keywords [en]
covariate shift, robust voice pathology detection
National Category
Natural Language Processing; Computer Sciences
Identifiers
URN: urn:nbn:se:umu:diva-214779
DOI: 10.21437/Interspeech.2023-1332
ISI: 001186650301172
Scopus ID: 2-s2.0-85171525230
OAI: oai:DiVA.org:umu-214779
DiVA, id: diva2:1806520
Conference
24th Annual Conference of the International Speech Communication Association, Interspeech 2023, Dublin, August 20-24, 2023
Available from: 2023-10-23
Created: 2023-10-23
Last updated: 2025-04-24
Bibliographically approved

Open Access in DiVA

fulltext (1154 kB), 206 downloads
File information
File name: FULLTEXT01.pdf
File size: 1154 kB
Checksum (SHA-512):
480068f8c22f4086bd759f29820cec70e2a437d29414c0c84fe4d0dab8aa8250d3ce65cc6c9975e5b1713251523a6697752a9d3c0643ab0df88ea82f639d5ea4
Type: fulltext
Mimetype: application/pdf


Authority records

Vu, Xuan-Son

By organisation
Department of Computing Science
Total: 206 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 515 hits