Optimizing Distributed Tracing Overhead in a Cloud Environment with OpenTelemetry
Umeå University, Faculty of Science and Technology, Department of Computing Science.
2024 (English)
Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]

Gaining observability in a distributed system requires generating and gathering telemetry. This is especially important when services depend on layers of other microservices. One method for observability is distributed tracing: building causal chains of events across microservices, called traces. Traces make it possible to find bottlenecks and dependencies within each call chain. One framework for implementing distributed tracing is OpenTelemetry. Deploying OpenTelemetry in a Kubernetes cluster requires the developer to make design choices. For example, OpenTelemetry provides a collector that gathers spans, the parts that make up a trace, from microservices. Collectors can be deployed one per node, as a daemonset, or one per service, as sidecars. This study compared the performance impact of the sidecar and daemonset setups with that of running no OpenTelemetry. The resources analyzed were CPU, network, and RAM usage. Tests covered four scenarios: every combination of two cluster sizes (2 and 4 nodes) and two service placements (balanced and unbalanced). The experiments were run in a cloud environment using Kubernetes. The tested system was an emulation of one of Nasdaq's systems, based on real data from the company. The study concluded that OpenTelemetry increased resource usage in all cases. Compared with no OpenTelemetry, the daemonset setup increased CPU usage by 46.5 %, network usage by 18.25 %, and memory usage by 47.5 % on average. The sidecar setup performed worse than the daemonset setup in most cases and for most resources, especially RAM and CPU usage.
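As a rough illustration of the two deployment modes and of how a relative overhead figure such as those above is computed, the following is a minimal Python sketch. It is not taken from the thesis: the function names, the service count of 10, and the example measurements are all hypothetical and chosen only to show the arithmetic.

```python
def collector_count(mode: str, nodes: int, services: int) -> int:
    """Return how many collector instances a deployment mode implies.

    Assumes the simplified model from the abstract: a daemonset runs one
    collector per node, a sidecar setup runs one collector per service.
    """
    if mode == "daemonset":
        return nodes
    if mode == "sidecar":
        return services
    if mode == "none":
        return 0
    raise ValueError(f"unknown mode: {mode}")


def relative_overhead(baseline: float, measured: float) -> float:
    """Return the percentage increase of `measured` over `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (measured - baseline) / baseline * 100.0


# Hypothetical cluster: 10 services on the two cluster sizes from the study.
for nodes in (2, 4):
    for mode in ("none", "daemonset", "sidecar"):
        print(nodes, mode, collector_count(mode, nodes, services=10))

# Hypothetical CPU readings (arbitrary units), not measurements from the thesis.
print(relative_overhead(baseline=200.0, measured=293.0))
```

Under this simplified model the sidecar setup runs more collector instances than the daemonset setup whenever there are more services than nodes, which is one plausible reason for its higher RAM and CPU usage.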

Place, publisher, year, edition, pages
2024. p. 43
Series
UMNAD ; 1467
Keywords [en]
OpenTelemetry, Cloud, Distributed tracing, Collector, Optimization, Kubernetes, tracing, Distributed systems
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:umu:diva-225868
OAI: oai:DiVA.org:umu-225868
DiVA, id: diva2:1867119
External cooperation
Nasdaq
Educational program
Master's Programme in Computing Science
Presentation
2024-05-29, MIT.A.316, Umeå, 10:45 (English)
Supervisors
Examiners
Available from: 2024-06-24. Created: 2024-06-10. Last updated: 2024-06-24. Bibliographically approved.

Open Access in DiVA

fulltext (4092 kB), 238 downloads
File information
File name: FULLTEXT01.pdf
File size: 4092 kB
Checksum SHA-512: 97d75ad994b8714ac814135bc6b7c5713c05734bf16f4ff6191e7de738c3ae2e437178c1696ddbfedcd796c53c0a6059f4d3099e0b3d42d43cdfaab81bea5d84
Type: fulltext
Mimetype: application/pdf

Author
Elias, Norgren
