A Comparative Analysis of Metadata Tools for use on Unknown Operational Datasets
Umeå University, Faculty of Science and Technology, Department of Computing Science.
2024 (English). Independent thesis, advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
Abstract [en]

When working with large datasets, it is important to select the right tools and methods in order to analyze the data effectively. This thesis presents a comparative evaluation of data management tools in the categories of validation, profiling, and feature extraction. Four tools (Pandera, Ydata Profiling, SweetViz, and Tsfel) were selected and integrated into a data processing system for the WARA-Ops portal in order to validate, profile, and analyze new operational datasets uploaded to the portal. Finally, the system extracts statistical information from the dataset and uses a machine learning classification algorithm to apply a general label to the data based on the extracted information.
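
The final step in the abstract (extract statistical information, then classify the dataset to give it a general label) can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not name the classifier or the features used, so the sketch assumes a simple nearest-centroid rule and two hand-picked statistics, uses only the Python standard library rather than Tsfel, and all function names, labels, and training values are hypothetical.

```python
import math
import statistics


def extract_features(values):
    """Summarize a numeric column as (population std. deviation, value range)."""
    return (statistics.pstdev(values), max(values) - min(values))


def centroid(feature_rows):
    """Average each feature across a list of feature tuples."""
    return tuple(statistics.fmean(col) for col in zip(*feature_rows))


def fit_centroids(training):
    """Map each label to the centroid of its example columns' features."""
    return {
        label: centroid([extract_features(col) for col in columns])
        for label, columns in training.items()
    }


def label_column(values, centroids):
    """Return the label whose centroid is nearest (Euclidean) to the column's features."""
    features = extract_features(values)
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


# Hypothetical labeled example columns (assumption, for illustration only).
training = {
    "constant-like": [[5.0, 5.0, 5.1, 5.0], [2.0, 2.0, 2.0, 2.1]],
    "highly-variable": [[1.0, 9.0, 2.0, 8.0], [0.0, 10.0, 1.0, 12.0]],
}
centroids = fit_centroids(training)

print(label_column([3.0, 3.1, 3.0, 2.9], centroids))
print(label_column([0.0, 7.0, 1.0, 9.0], centroids))
```

In a system like the one described, the hand-picked statistics would presumably be replaced by the output of a feature extraction tool, and the nearest-centroid rule by a trained classifier.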

Place, publisher, year, edition, pages
2024. p. 43
Series
UMNAD ; 1497
National Category
Identifiers
URN: urn:nbn:se:umu:diva-227466
OAI: oai:DiVA.org:umu-227466
DiVA, id: diva2:1879306
External cooperation
Ericsson
Educational program
Master of Science Programme in Computing Science and Engineering
Supervisors
Examiner
Available from: 2024-06-28 Created: 2024-06-28 Last updated: 2025-04-01 Bibliographically approved

Open Access in DiVA

A Comparative Analysis of Metadata Tools for use on Unknown Operational Datasets (724 kB), 4 downloads
File information
File: FULLTEXT01.pdf
File size: 724 kB
Checksum (SHA-512):
0e4107fa01d9b334a9fe783c7a63b6b4232fa59c5b5f607fa9213157a3fbcd3fe8f3a9e151e69f42f703ccd20626fd7eff712875a473d6d7d23199039411de0c
Type: fulltext
Mimetype: application/pdf
