Robust procedural learning for anomaly detection and observability in 5G RAN
Umeå University, Faculty of Science and Technology, Department of Computing Science (ADS LAB). ORCID iD: 0000-0001-9013-6603
Umeå University, Faculty of Science and Technology, Department of Computing Science. ORCID iD: 0000-0002-9842-7840
Umeå University, Faculty of Science and Technology, Department of Computing Science. ORCID iD: 0000-0002-2633-6798
2024 (English). In: IEEE Transactions on Network and Service Management, E-ISSN 1932-4537, Vol. 21, no 2, p. 1432-1445. Article in journal (Refereed), Published
Abstract [en]

Most existing large distributed systems have poor observability and cannot use the full potential of machine learning-based behavior analysis. The system logs, which are the primary source of information, are unstructured and lack the context needed to track procedures and learn the system’s behavior. This work presents a new trace guideline that enables a component- and procedure-based split of the system logs for the future 5G Radio Access Network (RAN). Because the system can be broken into smaller pieces, models can learn its behavior more accurately and use the context to improve anomaly detection and observability. In the evaluation, where previous state-of-the-art methods struggle to learn the behavior, a fast, dictionary-based algorithm detects all anomalies while keeping false positives close to zero. Troubleshooters can also identify anomalies more quickly and gain useful insights into component interaction in RAN.

Place, publisher, year, edition, pages
IEEE, 2024. Vol. 21, no 2, p. 1432-1445
Keywords [en]
observability, trace guidelines, anomaly detection, Radio Access Network, 5G
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:umu:diva-206058
DOI: 10.1109/TNSM.2023.3321401
ISI: 001205268100057
Scopus ID: 2-s2.0-85174831102
OAI: oai:DiVA.org:umu-206058
DiVA, id: diva2:1746124
Funder
Knut and Alice Wallenberg Foundation
Note

Originally included in thesis in manuscript form.

Available from: 2023-03-27 Created: 2023-03-27 Last updated: 2025-04-24 Bibliographically approved
In thesis
1. Machine learning-based diagnostics and observability in mobile networks
2023 (English). Doctoral thesis, comprehensive summary (Other academic)
Alternative title [sv]
Maskininlärningsbaserad diagnostik och observerbarhet i mobila nätverk
Abstract [en]

To meet the high performance and reliability demands of 5G, the Radio Access Network (RAN) is moving to a cloud-native architecture. The new microservice architecture promises increased operational efficiency and a shorter time-to-market, but it comes at a price. The distributed and virtualized architecture is far more complex than its predecessors, and the increasing number of features it brings makes troubleshooting more difficult. So far, RAN troubleshooters have relied on their expertise to analyze systems manually, but the ever-growing data and increased complexity make it challenging to grasp system behavior.

This thesis makes three contributions: the proposed machine learning and statistical methods help RAN troubleshooters find deviations in system logs, identify the root cause of these deviations, and improve the system's observability. These methods learn the application's behavior from system log events and can identify behavior deviations from many different aspects. The thesis also demonstrates how observability can be improved by using a new software instrumentation guideline. The guideline enables the tracking of systemized procedures and enhances system understanding. Its purpose is to make RAN developers aware that machine learning can utilize debug information and help their troubleshooting process. To familiarize the reader with the research area, the thesis first discusses the challenges and the methods that can be used to detect anomalies, perform root cause analysis, and observe RAN system behavior. The proposed methods are integrated and tested in an advanced 5G test bed to evaluate their accuracy, speed, system impact, and implementation cost.

The results demonstrate the advantage of using machine learning and statistical methods when troubleshooting the behavior of RAN. Machine learning methods similar to those presented in this thesis may help those who troubleshoot RAN and accelerate the development of 5G. The thesis ends by presenting potential research areas where this research could be further developed and applied, both in RAN and in other systems.

Abstract [sv] (translated from Swedish)

To meet the high demands on performance and reliability in the new 5G mobile network, the Radio Access Network (RAN) is now moving to a cloud-based architecture. The new microservice architecture is intended to increase scalability and performance and to shorten lead times for product deliveries. The distributed and virtualized architecture is, however, more complicated than before, which makes troubleshooting harder. So far, those who troubleshoot RAN have relied on their expertise to analyze the system manually. But the ever-growing volume of data and the increased complexity make it difficult to understand the system's behavior.

This thesis contributes knowledge in three related areas, where the proposed machine learning and statistical methods help those who troubleshoot RAN find deviations in system logs, help identify the root cause of these deviations, and improve the system's observability. These methods learn RAN's behavior from events in system logs and can identify a number of behavior deviations. The thesis also shows how observability can be improved by using a new guideline for software instrumentation. The guideline makes it possible to follow how RAN's applications affect each other, which in turn improves system understanding. The purpose of the guideline is to make those who work with RAN aware of how machine learning can help in their troubleshooting process. To familiarize the reader with the research area, the thesis first discusses the challenges and the methods that can be used to detect deviations in RAN data, find the cause of the deviations, and improve the observability of the system. To evaluate the accuracy, speed, system impact, and implementation cost of the proposed methods, the methods are integrated and tested in an advanced 5G test bed.

The results show the great advantages of using machine learning and statistical methods when troubleshooting the behavior of RAN. Machine learning methods similar to those presented in this thesis may come to help those who troubleshoot RAN and accelerate the development of 5G. The thesis concludes with a presentation of potential research areas where the research in this thesis could be further developed and applied, both in RAN and in other systems.

Place, publisher, year, edition, pages
Umeå: Umeå universitet, 2023. p. 45
Series
Report / UMINF, ISSN 0348-0542 ; 23.02
Keywords
Anomaly detection, Root cause analysis, Observability, Machine learning, Radio Access Network, 5G
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-206055 (URN)
978-91-8070-053-5 (ISBN)
978-91-8070-054-2 (ISBN)
Public defence
2023-04-21, Aula Biologica BIO.E.203, Umeå, 09:15 (English)
Opponent
Supervisors
Available from: 2023-03-31 Created: 2023-03-27 Last updated: 2024-07-02 Bibliographically approved

Authority records

Sundqvist, Tobias; Bhuyan, Monowar H.; Elmroth, Erik
