Named entity recognition in Italian lung cancer clinical reports using transformers
2023 (English). In: Proceedings - 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023, Institute of Electrical and Electronics Engineers (IEEE), 2023, p. 4101-4107. Conference paper, Published paper (Refereed)
Abstract [en]
The widespread adoption of electronic health records (EHRs) offers a valuable opportunity to support clinical research, since they contain crucial patient information, including diagnoses, symptoms, medications, lab tests, and more. Despite the success of deep learning for biomedical Named Entity Recognition (NER), the literature still presents a gap regarding applications focused on lung cancer in the Italian language. Hence, this paper presents a transformer-based approach to extract named entities from Italian clinical notes related to Non-Small Cell Lung Cancer (NSCLC). We introduce a novel set of 25 clinical entities related to NSCLC and build a corpus annotated for NER. We apply a state-of-the-art model pre-trained on Italian biomedical texts to the manually annotated clinical reports of a cohort of 257 patients suffering from NSCLC, successfully dealing with class-imbalance problems and obtaining promising performance (average F1-score of 84.3%). We also compare our method with two other pre-trained state-of-the-art models, showing that the domain-specific knowledge offered by the proposed approach is necessary to achieve higher performance. These findings also showcase the feasibility of using transformers to extract biomedical information in the Italian language.
Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2023. p. 4101-4107
Keywords [en]
deep learning, EHRs, NER, NSCLC, transformer
National Category
Information Systems
Identifiers
URN: urn:nbn:se:umu:diva-221398
DOI: 10.1109/BIBM58861.2023.10385778
Scopus ID: 2-s2.0-85184904088
ISBN: 9798350337488 (electronic)
OAI: oai:DiVA.org:umu-221398
DiVA, id: diva2:1840884
Conference
2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023, Istanbul, 5-8 December 2023.
2024-02-27; 2024-02-27; 2024-02-27. Bibliographically approved