Nylén, Fredrik, Docent (ORCID iD: orcid.org/0000-0003-3373-0934)
Publications (10 of 59)
Nylén, F. (2025). An acoustic model of speech dysprosody in patients with Parkinson's disease. Frontiers in Human Neuroscience, 9, Article ID 1566274.
2025 (English). In: Frontiers in Human Neuroscience, E-ISSN 1662-5161, Vol. 9, article id 1566274. Article in journal (Refereed), Published
Abstract [en]

Purpose: This study aimed to determine the acoustic properties most indicative of dysprosody severity in patients with Parkinson's disease using an automated acoustic assessment procedure.

Method: A total of 108 read speech recordings of 68 speakers with PD (45 male, 23 female, aged 65.0 ± 9.8 years) were made with active levodopa treatment. A total of 40 of the patients were additionally recorded without levodopa treatment to increase the range of dysprosody severity in the sample. Four human clinical experts independently assessed the patients' recordings in terms of dysprosody severity. Separately, a speech processing pipeline extracted the acoustic properties of prosodic relevance from automatically identified portions of speech used as utterance proxies. Five machine learning models were trained on 75% of speech portions and the perceptual evaluations of the speaker's dysprosody severity in a 10-fold cross-validation procedure. They were evaluated regarding their ability to predict the perceptual assessments of recordings excluded during training. The models' performances were assessed by their ability to accurately predict clinical experts' dysprosody severity assessments.

Results: The acoustic predictors of importance spanned several acoustic domains of prosodic relevance, with the variability in fo change between intonational turning points and the average first Mel-frequency cepstral coefficient at these points being the two top predictors. While predominant in the literature, variability in utterance-wide fo was found to be only the fifth strongest predictor.

Conclusion: Human expert raters' assessments of dysprosody can be approximated by the automated procedure, affording application in clinical settings where an experienced expert is unavailable. Variability in pitch does not adequately describe the level of dysprosody due to Parkinson's disease.
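
The partitioning described in the Method paragraph (75% of speech portions for training, 10-fold cross-validation within that partition, the remainder held out for evaluation) can be illustrated with a minimal sketch. The function name, the seed, and the item count of 200 are hypothetical, and the sketch splits over individual items as the abstract words it; whether the study grouped portions by speaker or recording when splitting is not stated here.

```python
import random


def holdout_then_kfold(n_items, train_fraction=0.75, k=10, seed=0):
    """Split item indices into a training and a held-out partition,
    then divide the training partition into k cross-validation folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = round(n_items * train_fraction)
    train, held_out = indices[:n_train], indices[n_train:]
    # Fold i takes every k-th training item, so fold sizes differ by at most one.
    folds = [train[i::k] for i in range(k)]
    return train, held_out, folds


# Hypothetical number of speech portions, for illustration only.
train, held_out, folds = holdout_then_kfold(n_items=200)
for validation in folds:
    validation_set = set(validation)
    fitting = [j for j in train if j not in validation_set]
    # A model would be fitted on `fitting` and tuned against `validation` here.
    assert len(fitting) + len(validation) == len(train)

# The held-out items are only used once, for the final evaluation.
print(len(train), len(held_out), [len(f) for f in folds])
```

Keeping the held-out partition entirely outside the cross-validation loop is what allows it to serve as an estimate of performance on unseen recordings.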

Place, publisher, year, edition, pages
Frontiers Media S.A., 2025
Keywords
automatic acoustic assessment, dysprosody, Parkinson’s disease, dysarthria, prosody
National Category
Natural Language Processing Other Medical Sciences not elsewhere specified
Research subject
Neurology
Identifiers
urn:nbn:se:umu:diva-238197 (URN); 10.3389/fnhum.2025.1566274 (DOI)
Available from: 2025-04-28. Created: 2025-04-28. Last updated: 2025-04-28. Bibliographically approved.
Vouzouneraki, K., Karlsson, F., Holmberg, J., Olsson, T., Berinder, K., Höybye, C., . . . Dahlqvist, P. (2025). Digital voice analysis as a biomarker of acromegaly. Journal of Clinical Endocrinology and Metabolism, 110(4), 983-990, Article ID dgae689.
2025 (English). In: Journal of Clinical Endocrinology and Metabolism, ISSN 0021-972X, E-ISSN 1945-7197, Vol. 110, no 4, p. 983-990, article id dgae689. Article in journal (Refereed), Published
Abstract [en]

Context: There is a considerable diagnostic delay in acromegaly, contributing to increased morbidity. Voice changes due to orofacial and laryngeal changes are common in acromegaly.

Objective: Our aim was to explore the use of digital voice analysis as a biomarker for acromegaly using broad acoustic analysis and machine learning.

Methods: Voice recordings from patients with acromegaly and matched controls were collected using a mobile phone at Swedish university hospitals. Anthropometric and clinical data and the Voice Handicap Index (VHI) were assessed. Digital voice analysis of a sustained and stable vowel [a] resulted in 3274 parameters, which were used for training of machine learning models classifying the speaker as “acromegaly” or “control.” The machine learning models were trained with 76% of the data and the remaining 24% was used to assess their performance. For comparison, voice recordings of 50 pairs of participants were assessed by 12 experienced endocrinologists.

Results: We included 151 Swedish patients with acromegaly (13% biochemically active and 10% newly diagnosed) and 139 matched controls. The machine learning model identified patients with acromegaly more accurately (area under the receiver operating curve [ROC AUC] 0.84) than experienced endocrinologists (ROC AUC 0.69). Self-reported voice problems were more pronounced in patients with acromegaly than matched controls (median VHI 6 vs 2, P < .01) with higher prevalence of clinically significant voice handicap (VHI ≥20: 22.5% vs 3.6%).

Conclusion: Digital voice analysis can identify patients with acromegaly from short voice recordings with high accuracy. Patients with acromegaly experience more voice disorders than matched controls.
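
The ROC AUC values reported in the Results can be read as the probability that a randomly chosen patient receives a higher classifier score than a randomly chosen control. A minimal sketch of that pairwise formulation follows; the function name, labels, and scores are hypothetical illustrations and are not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the share of positive/negative
    pairs in which the positive case has the higher score (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores; 1 = "acromegaly", 0 = "control".
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
# Three of the four positive/negative pairs are ranked correctly.
print(roc_auc(labels, scores))  # 0.75
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.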

Place, publisher, year, edition, pages
Oxford University Press, 2025
Keywords
Voice Handicap Index, acromegaly, digital voice analysis, machine learning
National Category
Endocrinology and Diabetes
Research subject
computational linguistics
Identifiers
urn:nbn:se:umu:diva-231262 (URN); 10.1210/clinem/dgae689 (DOI); 001341029100001 (ISI); 39363748 (PubMedID); 2-s2.0-105000481113 (Scopus ID)
Funder
Swedish Research Council, 2018-2024; Swedish Research Council, 2017-00626; Swedish Association of Local Authorities and Regions; The Kempe Foundations
Available from: 2024-10-30. Created: 2024-10-30. Last updated: 2025-04-28. Bibliographically approved.
Nylén, F., Skotare, T. & von Boer, J. (2025). The Visible Speech (VISP) platform: a secure infrastructure for the study of speech acts and spoken conversations. In: DHNB 2025: 9th Conference on Digital Humanities in the Nordic and Baltic Countries. Paper presented at 9th Conference on Digital Humanities in the Nordic and Baltic Countries (DHNB 2025), Tartu, Estonia, March 5-7, 2025.
2025 (English). In: DHNB 2025: 9th Conference on Digital Humanities in the Nordic and Baltic Countries, 2025. Conference paper, Poster (with or without abstract) (Refereed)
Abstract [en]

Spoken language is central to many human interactions and provides the medium through which activities and events across many humanities and social sciences disciplines are studied. It is also the object of active study in itself. As central to humanity as spoken language is, regulations aimed at mitigating privacy concerns also affect the affordance for collaborations on a national or larger scale based on spoken materials.  

The Visible Speech (VISP) platform is a web-based research infrastructure at Humlab, Umeå University, designed to handle audio recordings of speech in compliance with the national implementation of GDPR and security requirements. VISP provides a centralized environment for research of all disciplines in which recordings of spoken language constitute the primary material, meeting both researchers’ needs for efficient workflows and legislators’ demands for secure data management.

One of VISP's primary advantages is its ability to facilitate research on audio recordings that now constitute personally identifiable information (PII) under the application of the GDPR in Sweden. These recordings may contain sensitive content or have been made in sensitive contexts, classifying them as sensitive PII under national legislation. Sensitive content may concern, for instance, the ethnicity or religious beliefs of the speaker, and sensitive contexts arise when the recording is made in a healthcare setting or in a setting where a person's membership of a trade union is divulged. While the challenges in conducting larger research efforts on these types of materials are currently aggravated by the local implementation of the GDPR in Sweden, it is not yet clear to what extent upcoming AI regulation will, in effect, extend identical or similar constraints to research in other EU countries as well.

The VISP platform offers a unified environment for storage, controlled access, direct work, and reproducible speech signal processing. VISP is built on the foundation of earlier research efforts [1, 2] and includes a comprehensive set of speech and voice analysis procedures within one framework. Thus, national research groups can collectively store interviews or other spoken language recordings, have automatic transcriptions or other speech processing performed, and access the results for complementary manual annotation or analysis simultaneously and securely. Additionally, VISP facilitates the digital archiving of projects through a uniform, documented, and transparent directory structure, reducing barriers to making data available following the FAIR principles. Research projects dealing with sensitive personal data in audio recording form require review by the Ethical Approval Authority and may subsequently take advantage of the VISP facilities.

A significant feature of VISP is its integration with the Swedish Academic Identity Federation (SWAMID, connected to eduGAIN), which enables researchers across Sweden to have secure, federated logins. This national federated login system allows researchers to access project data and collaborate on material processing in ways that were previously not possible. Moreover, VISP supports projects by lowering the threshold to digital signal processing and audio analysis of the collected audio signals. This capability allows researchers to perform hands-on processing and analysis without the risk of disseminating sensitive audio recordings. By leveraging SWAMID, VISP ensures that researchers can work seamlessly and securely on collected materials, enhancing collaborative efforts and data handling efficiency. By providing tools for direct manipulation and examination of audio data, VISP ensures that all stages of data handling, from collection to analysis, are conducted within a secure environment, thereby maintaining the integrity and confidentiality of sensitive information.

The work conducted within VISP is part of SweCLARIN, the Swedish node of the European Research Infrastructure Consortium (ERIC) CLARIN. SweCLARIN aims to develop and provide national and European infrastructure for speech and text-based e-science, offering extensive digitized materials and advanced language technology tools. By combining the advanced technologies developed by CLARIN ERIC partners [1] with stringent security protocols and leveraging federated login systems, VISP enables efficient and secure research on audio recordings of speech. The VISP components are available for download and setup of local instances and for modification, and the framework, therefore, promises to provide an invaluable tool for researchers, facilitating unprecedented collaboration and data processing within the digital humanities on both a national and larger scale.

References:

[1] R. Winkelmann, J. Harrington, K. Jänsch, EMU-SDMS: Advanced speech database management and analysis in R, Comput. Speech Lang. 45 (2017) 392–410.

[2] R. Winkelmann, J. Harrington, EMU-SDMS: R centric semi-automatic speech database processing and analysis, in: S. Calhoun, P. Escudero, M. Tabain, P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia 2019, Australasian Speech Science and Technology Association Inc., Canberra, Australia, 2019, pp. 1317–1321.

National Category
Languages and Literature Other Humanities
Research subject
digital humanities
Identifiers
urn:nbn:se:umu:diva-235877 (URN)
Conference
9th Conference on Digital Humanities in the Nordic and Baltic Countries (DHNB 2025), Tartu, Estonia, March 5-7, 2025
Funder
Swedish Research Council, 2023-00161-16
Available from: 2025-02-24. Created: 2025-02-24. Last updated: 2025-02-24. Bibliographically approved.
Nylén, F., Holmberg, J. & Södersten, M. (2024). Acoustic cues to femininity and masculinity in spontaneous speech. Journal of the Acoustical Society of America, 155(5), 3090-3100
2024 (English). In: Journal of the Acoustical Society of America, ISSN 0001-4966, E-ISSN 1520-8524, Vol. 155, no 5, p. 3090-3100. Article in journal (Refereed), Published
Abstract [en]

The perceived level of femininity and masculinity is a prominent property by which a speaker’s voice is indexed, and a vocal expression incongruent with the speaker’s gender identity can greatly contribute to gender dysphoria. Our understanding of the acoustic cues to the levels of masculinity and femininity perceived by listeners in voices is not well developed, and an increased understanding of them would benefit communication of therapy goals and evaluation in gender-affirming voice training. We developed a voice bank with 132 voices with a range of levels of femininity and masculinity expressed in the voice, as rated by 121 listeners in independent, individually randomized perceptual evaluations. Acoustic models were developed from measures identified as markers of femininity or masculinity in the literature using penalized regression and tenfold cross-validation procedures. The 223 most important acoustic cues explained 89% and 87% of the variance in the perceived level of femininity and masculinity in the evaluation set, respectively. The median fo was confirmed to provide the primary cue, but other acoustic properties must be considered in accurate models of femininity and masculinity perception. The developed models are proposed to afford communication and evaluation of gender-affirming voice training goals and improve voice synthesis efforts.

Place, publisher, year, edition, pages
American Institute of Physics (AIP), 2024
National Category
Other Medical Sciences not elsewhere specified
Research subject
Oto-Rhino-Laryngology; language studies; Physics
Identifiers
urn:nbn:se:umu:diva-224137 (URN); 10.1121/10.0025932 (DOI); 001215860700004 (ISI); 38717212 (PubMedID); 2-s2.0-85192627110 (Scopus ID)
Available from: 2024-05-08. Created: 2024-05-08. Last updated: 2025-01-07. Bibliographically approved.
Brage, L., Karlsson, F., Hägglund, P. & Holmlund, T. (2024). eTWST: an extension to the timed water swallow test for increased dysphagia screening accuracy. Dysphagia (New York. Print)
2024 (English). In: Dysphagia (New York. Print), ISSN 0179-051X, E-ISSN 1432-0460. Article in journal (Refereed), Epub ahead of print
Abstract [en]

We aimed to fine-tune the Timed Water Swallow Test (TWST) screening procedure to provide the most reliable prediction of the Flexible Endoscopic Evaluation of Swallowing (FEES) assessment outcomes, with age, sex, and the presence of clinical signs of dysphagia being considered in the assessment. Participants were healthy people and patients with suspected dysphagia. TWST performance and participants' reported dysphagia symptoms were assessed in terms of their utility in predicting the outcome of a FEES assessment the same day. The FEES assessors were blinded to the nature of the TWST performance. The water swallowing capacity levels and clinical observations during a screening performance that were indicative of dysphagia/no symptoms in FEES were determined. Convergent validity was assessed as the agreement with the Functional Oral Intake Scale (FOIS) in the FEES assessment. TWST predicted FEES findings (aspiration and dysphagia) with a sensitivity of 72% and 45% and a specificity of 75% and 80%, respectively. Extended analysis of the TWST procedure (eTWST) identified aspiration (sensitivity = 92%, specificity = 62%) and dysphagia (sensitivity = 70%, specificity = 72%) more accurately and showed a high correlation with FOIS (ɸ = 0.37). Excellent inter-rater reliability was further observed (Kw = 0.83). The extended evaluation of TWST performance has superior criterion validity to that of TWST. eTWST displayed high convergent validity and excellent inter-rater reliability. We therefore believe that eTWST can be highly relevant for clinical dysphagia screening.
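
As a reminder of how the screening metrics above are defined, sensitivity and specificity can be computed from the cross-tabulation of screening outcome against the reference assessment (here, FEES). The sketch below is illustrative only: the function names and the counts are hypothetical and are not taken from the study.

```python
def sensitivity(true_pos, false_neg):
    """Share of reference-positive cases that the screening flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of reference-negative cases that the screening clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for a screening test compared against a reference test.
true_pos, false_neg = 18, 2    # reference-positive participants
true_neg, false_pos = 30, 10   # reference-negative participants

print(f"sensitivity = {sensitivity(true_pos, false_neg):.2f}")
print(f"specificity = {specificity(true_neg, false_pos):.2f}")
```

A screening procedure typically trades one for the other: flagging more participants raises sensitivity at the cost of specificity.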

Place, publisher, year, edition, pages
Springer Science+Business Media B.V., 2024
National Category
Otorhinolaryngology
Research subject
Oto-Rhino-Laryngology
Identifiers
urn:nbn:se:umu:diva-231621 (URN); 10.1007/s00455-024-10778-z (DOI); 001350450500001 (ISI); 39521747 (PubMedID); 2-s2.0-85208779124 (Scopus ID)
Funder
Region Västerbotten, 7003394
Available from: 2024-11-10 Created: 2024-11-10 Last updated: 2024-11-19
Holmberg, J., Södersten, M., Linander, I. & Karlsson, F. (2024). Perception of femininity and masculinity in voices as rated by transgender and gender diverse people, professional speech and language pathologists, and cisgender naive listeners. Journal of Voice
2024 (English). In: Journal of Voice, ISSN 0892-1997, E-ISSN 1873-4588. Article in journal (Refereed), Epub ahead of print
Abstract [en]

Objective: To explore whether cisgender naive listeners, transgender and gender diverse (TGD) listeners, and speech-language pathologists (SLPs) experienced in providing gender-affirming voice training differ in their perception of femininity and masculinity in voices.

Methods: Samples of spontaneous speech were collected from 95 cisgender and 37 TGD speakers. Three listener groups, consisting of cisgender naive (N = 77), TGD (N = 30), and SLP (N = 14) listeners, rated the voices on visual analog scales in two randomly ordered blocks, in which the perceived degree of femininity was rated separately from the perceived degree of masculinity.

Results: The three listener groups showed similar patterns in their distribution of ratings on the femininity and masculinity scales. The TGD listeners’ mean ratings did not differ from the cisgender naive listeners’, whereas SLPs showed a small, but significant, difference in their ratings compared with both TGD and cisgender naive listeners and rated the voices lower on both the femininity and masculinity scales.

Conclusion: The results differ from previous studies in that TGD and cisgender naive listeners rated the voices very similarly. The lower ratings of femininity and masculinity by the SLPs were likely influenced by their awareness of the complexity in the perception of voices. Therefore, SLPs providing gender-affirming voice training should be attentive to how their professional training may influence their perception of femininity and masculinity in voices and encourage discussions and explorations of the TGD voice client's perceptions of voices.

Place, publisher, year, edition, pages
Elsevier, 2024
Keywords
Femininity perception, Gender diverse, Masculinity perception, SLP, Transgender, Voice
National Category
Gender Studies
Identifiers
urn:nbn:se:umu:diva-229333 (URN); 10.1016/j.jvoice.2024.07.034 (DOI); 39179471 (PubMedID); 2-s2.0-85202476085 (Scopus ID)
Available from: 2024-09-13 Created: 2024-09-13 Last updated: 2025-03-26
Holmgren, E., Spyckerelle, I., Hultin, M., Karlsson, F., Ottander, U., Sahlin, C., . . . Franklin, K. A. (2024). Reading aloud compared with positive expiratory pressure after abdominal surgery: a randomized controlled trial. International Journal of Surgery. Global Health, 7(6), Article ID e00487.
2024 (English). In: International Journal of Surgery. Global Health, ISSN 2576-3342, Vol. 7, no 6, article id e00487. Article in journal (Refereed), Published
Abstract [en]

Background: Despite a lack of supporting evidence, positive expiratory pressure therapy is used in rehabilitation worldwide to prevent postoperative hypoxia. Reading aloud could be used as an alternative therapy, as lung volumes increase while speaking. We aimed to investigate whether reading aloud is superior to positive expiratory pressure therapy for improving oxygen saturation after abdominal surgery.

Material and Methods: This crossover randomized controlled trial compared reading a text aloud with positive expiratory pressure therapy in patients on postoperative day 1 or 2 after upper gastrointestinal, colorectal, urological, or gynecological abdominal surgery at Umeå University Hospital, Sweden. The primary outcome was the change in peripheral oxygen saturation compared with baseline at 7 min after the intervention. The secondary outcome was transcutaneous carbon dioxide partial pressure change.

Results: This study included 50 patients, of whom 48 were analyzed. Peripheral oxygen saturation rapidly decreased to minimum values below baseline immediately after both interventions and then increased to values above baseline after reading aloud (1%, 95% confidence interval 0.2% to 1%, P = 0.004), but not after positive expiratory pressure therapy (−0.2%, 95% confidence interval −1% to 0.4%, P = 0.436). The difference in oxygen saturation was 1% (95% confidence interval 0.1% to 2%, P = 0.039) at 7 min after termination of the interventions. The interventions reduced transcutaneous carbon dioxide partial pressure by similar amounts.

Conclusions: This trial adds to the evidence against the use of positive expiratory pressure therapy after abdominal surgery. It is even slightly better to read aloud.

Place, publisher, year, edition, pages
Wolters Kluwer, 2024
Keywords
abdominal surgery, positive expiratory pressure, postoperative hypoxia, postoperative pulmonary complications, speaking aloud
National Category
Surgery
Research subject
Surgery
Identifiers
urn:nbn:se:umu:diva-231261 (URN); 10.1097/GH9.0000000000000487 (DOI)
Funder
Swedish Heart Lung Foundation
Available from: 2024-10-30. Created: 2024-10-30. Last updated: 2025-04-10. Bibliographically approved.
Karlsson, F., Lovric, L., Matthelié, J., Brage, L. & Hägglund, P. (2023). A within-subject comparison of face-to-face and telemedicine screening using the timed water swallow test (TWST) and the test of mastication and swallowing of solids (TOMASS). Dysphagia (New York. Print), 38, 483-490
2023 (English). In: Dysphagia (New York. Print), ISSN 0179-051X, E-ISSN 1432-0460, Vol. 38, p. 483-490. Article in journal (Refereed), Published
Abstract [en]

The Timed Water Swallow Test (TWST) and the Test of Mastication and Swallowing of Solids (TOMASS) are dysphagia screening procedures that have been shown to be reliably assessed from video. The reliability of the procedures performed over telemedicine has not previously been assessed. TWST and TOMASS outcomes in two situations (face-to-face and over telemedicine) were compared for 48 participants (aged 60-90; 27 with clinical conditions and 21 older persons). Both the testing situation and the order of the tests were randomized, and all assessment procedures were performed within 3 h of each other. The results indicated a high level of agreement between face-to-face and telemedicine screening outcomes for both TWST and TOMASS. The assessments indicated an 83% and 76% agreement in classifications of individual participants as within or outside normal limits for the TWST and TOMASS, respectively, across the two test situations. The TWST showed a balanced distribution of differing classifications in telemedicine (0.16-0.19 error rates). The TOMASS procedure classified more participants as outside normal limits over telemedicine compared to face-to-face administration. Agreement in the observed number of swallows was substantially lower than for other outcome measures, which is attributed to increased difficulty in observing this property over video. Most participants (60%) reported that they would prefer telemedicine over face-to-face assessments, and 90% viewed the procedure as more accessible than expected. All participants were satisfied with the telemedicine procedures. The results suggest that clinical assessment of dysphagia over telemedicine using the TWST and TOMASS is a viable alternative to face-to-face administration of the procedures.

Place, publisher, year, edition, pages
Springer, 2023
Keywords
Comparison of administration situations, Dysphagia screening, Telemedicine
National Category
Other Medical Sciences not elsewhere specified
Research subject
Oto-Rhino-Laryngology
Identifiers
urn:nbn:se:umu:diva-197988 (URN); 10.1007/s00455-022-10490-w (DOI); 000822477700004 (ISI); 35809097 (PubMedID); 2-s2.0-85133601458 (Scopus ID)
Available from: 2022-07-10. Created: 2022-07-10. Last updated: 2023-06-19. Bibliographically approved.
Mirkoska, V., Antonsson, M., Hartelius, L. & Karlsson, F. (2023). Detection of subclinical motor speech deficits after presumed low-grade glioma surgery. Brain Sciences, 13(12), Article ID 1631.
2023 (English). In: Brain Sciences, E-ISSN 2076-3425, Vol. 13, no 12, article id 1631. Article in journal (Refereed), Published
Abstract [en]

Motor speech performance was compared before and after surgical resection of presumed low-grade gliomas. This pre- and post-surgery study was conducted on 15 patients (mean age = 41) with low-grade glioma classified based on anatomic features. Repetitions of /pa/, /ta/, /ka/, and /pataka/ recorded before and 3 months after surgery were analyzed regarding rate and regularity. A significant reduction (6 to 5.6 syllables/s) pre- vs. post-surgery was found in the rate for /ka/, which is comparable to the approximate average decline over 10–15 years of natural aging reported previously. For all other syllable types, rates were within normal age-adjusted ranges in both preoperative and postoperative sessions. The decline in /ka/ rate might reflect a subtle reduction in motor speech production, but the effects were not severe. All but one patient continued to perform within normal ranges post-surgery; one performed two standard deviations below age-appropriate norms pre- and post-surgery in all syllable tasks. The patient experienced motor speech difficulties, which may be related to the tumor’s location in an area important for speech. Low-grade glioma may reduce maximum speech-motor performance in individual patients, but larger samples are needed to elucidate how often the effect occurs.

Place, publisher, year, edition, pages
MDPI, 2023
Keywords
low-grade glioma, motor speech, diadochokinesis, acoustic analysis
National Category
Other Medical Sciences not elsewhere specified Otorhinolaryngology Surgery
Identifiers
urn:nbn:se:umu:diva-217042 (URN); 10.3390/brainsci13121631 (DOI); 001130474400001 (ISI); 2-s2.0-85180512398 (Scopus ID)
Funder
Umeå University
Available from: 2023-11-24. Created: 2023-11-24. Last updated: 2025-04-24. Bibliographically approved.
Holmberg, J., Linander, I., Södersten, M. & Karlsson, F. (2023). Exploring motives and perceived barriers for voice modification: the views of transgender and gender-diverse voice clients. Journal of Speech, Language and Hearing Research, 66(7), 2246-2259
2023 (English). In: Journal of Speech, Language and Hearing Research, ISSN 1092-4388, E-ISSN 1558-9102, Vol. 66, no 7, p. 2246-2259. Article in journal (Refereed), Published
Abstract [en]

Purpose: To date, transgender and gender-diverse voice clients' perceptions and individual goals have been missing in discussions and research on gender-affirming voice therapy. Little is, therefore, known about the client's expectations of therapy outcomes and how these are met by treatments developed from views of vocal gender as perceived by cisgender persons. This study aimed to explore clients' individual motives and perceived barriers to undertaking gender-affirming voice therapy.

Method: Individual, semistructured interviews with 15 transgender and gender-diverse voice clients considering voice therapy were conducted and explored using qualitative content analysis.

Results: Three themes were identified during the analysis of the participants' narratives. In the first theme, “the incongruent voice setting the rules,” the contribution of the voice to the experienced gender dysphoria is put in focus. The second theme, “to reach a voice of my own choice,” centers around anticipated personal gains from using a modified voice. The third theme, “a voice out of reach,” relates to worries about, and factors restricting, the ability to reach one's set goals for voice modification.

Conclusions: The interviews clearly indicate a need for a person-centered voice therapy that starts from the individual's expressed motives for modifying the voice yet is also affirmative of anticipated difficulties related to voice modification. We recommend that these themes form the basis of the pretherapy joint discussion between the voice client and the speech-language pathologist to ensure therapy goals that are realistic and relevant to the client.

Place, publisher, year, edition, pages
American Speech Language Hearing Association, 2023
Keywords
transgender, gender-affirming, voice dysphoria, voice therapy, voice modification, person-centered therapy, qualitative methodology
National Category
Clinical Medicine
Identifiers
urn:nbn:se:umu:diva-209215 (URN); 10.1044/2023_jslhr-23-00042 (DOI); 001041295400005 (ISI); 37263019 (PubMedID); 2-s2.0-85164624938 (Scopus ID)
Available from: 2023-06-07. Created: 2023-06-07. Last updated: 2025-01-08. Bibliographically approved.
Projects
Effects of deep brain stimulation treatment of Parkinson's disease on speech and articulation proficiency: A longitudinal and contrastive study [2009-00946_VR]; Umeå University
Intonation and rhythm in speech of patients with Parkinson's disease - a longitudinal investigation of effects of the disease and its treatment [2011-02294_VR]; Umeå University
Identifiers
ORCID iD: orcid.org/0000-0003-3373-0934