Detection of Synthetic Climate Misinformation with Machine Learning Algorithms and Sentence-Level Analysis
Umeå University, Faculty of Science and Technology, Department of Computing Science.
2025 (English). Independent thesis, basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis (degree project)
Abstract [en]

The spread of climate-related misinformation can reduce public support for climate change mitigation policies. A study showed that people on social media tend to absorb news content without knowing the details of its context. Large language models (LLMs) can therefore be used to spread misinformation and alter people's opinions for malicious purposes. To assess how well two machine learning algorithms, Support Vector Machine and Logistic Regression, detect LLM-generated misinformation, we created a synthetic dataset of 300 examples. We collected 150 climate-related news articles from various reputable sources. For each article, we used GPT-4 to write a five- to six-sentence summary, and then used GPT-4 again to produce a falsified version of that summary. We also evaluated each summary in the synthetic dataset with the FineSure framework to obtain its faithfulness, completeness and conciseness. On the synthetic dataset, Support Vector Machine achieved an F1-score of 0.839 and Logistic Regression an F1-score of 0.787. We then performed a sentence-level analysis of these models' false positive and false negative examples with the GUTEK framework. The analysis showed that policy-related sentences had the most impact on the models' false positive predictions, whereas factual sentences significantly influenced their false negative predictions.
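
For readers unfamiliar with the metric, the F1-scores reported in the abstract are the harmonic mean of precision and recall, both of which depend on the false positive and false negative counts the abstract refers to. The following is a minimal sketch in Python. The function name and the counts are hypothetical and are not taken from the thesis.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from confusion-matrix counts.

    tp: misinformation examples correctly flagged
    fp: genuine examples wrongly flagged as misinformation
    fn: misinformation examples the classifier missed
    """
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for illustration only (not results from the thesis).
precision, recall, f1 = precision_recall_f1(tp=40, fp=10, fn=10)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Because F1 penalises false positives and false negatives symmetrically, a single score does not reveal which kind of error dominates, which is one reason to inspect the two error types separately.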

Place, publisher, year, edition, pages
2025.
Series
UMNAD ; 1576
National subject category
Computer Science
Identifiers
URN: urn:nbn:se:umu:diva-242908
OAI: oai:DiVA.org:umu-242908
DiVA, id: diva2:1988038
Educational programme
Bachelor's Programme in Computing Science
Examiners
Available from: 2025-08-11 Created: 2025-08-10 Last updated: 2025-08-11 (bibliographically approved)

Open Access in DiVA

fulltext (2976 kB), 102 downloads
File information
File name: FULLTEXT01.pdf
File size: 2976 kB
Checksum (SHA-512):
79d718865e25d5c5c7a8ee9afdb3c79c53e54f20d7b8b8d04a91d7f305f87ac7478e352468ad12e1055c8f3330c40f5df7f95edb6986856a53439830bb4fefdf
Type: fulltext
Mimetype: application/pdf
