Performance problem diagnosis in cloud infrastructures
Umeå University, Faculty of Science and Technology, Department of Computing Science. (Distributed Systems)
2016 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Cloud datacenters comprise hundreds or thousands of disparate application services, each with stringent performance and availability requirements, sharing a finite set of heterogeneous hardware and software resources. In such a complex environment, performance problems such as slow application response and unplanned downtime have become the norm rather than the exception, resulting in decreased revenue, damaged reputation, and considerable human effort in diagnosis. Though causes can be as varied as application issues (e.g. bugs), machine-level failures (e.g. faulty servers), and operator errors (e.g. misconfigurations), recent studies have identified capacity-related issues, such as resource shortage and contention, as the cause of most performance problems on the Internet today. As cloud datacenters become increasingly autonomous, there is a need for automated performance diagnosis systems that can adapt their operation to reflect the changing workload and topology of the infrastructure. In particular, such systems should be able to detect anomalous performance events, uncover manifestations of capacity bottlenecks, localize the actual root cause(s), and possibly suggest or actuate corrections.

This thesis investigates approaches for diagnosing performance problems in cloud infrastructures. We present the outcome of an extensive survey of existing research contributions addressing performance diagnosis in diverse systems domains. We also present models and algorithms for detecting anomalies in real-time application performance and for identifying anomalous datacenter resources based on operational metrics and spatial dependency across datacenter components. Empirical evaluations of our approaches show how they can be used to improve end-user experience and service assurance and to support root-cause analysis.

Place, publisher, year, edition, pages
Umeå: Department of Computing Science, Umeå University, 2016. 28 p.
Series
Report / UMINF, ISSN 0348-0542 ; 16.14
Keyword [en]
Systems Performance, Performance anomalies, Performance bottlenecks, Cloud infrastructures, Cloud Computing, Cloud Services, Cloud Computing Performance, Performance problems, Performance anomaly detection, Performance bottleneck identification, Performance Root-cause Analysis
National Category
Computer Systems
Research subject
Computer Systems; Computer Science
Identifiers
URN: urn:nbn:se:umu:diva-120287
ISBN: 978-91-7601-500-1 (print)
OAI: oai:DiVA.org:umu-120287
DiVA: diva2:928037
Presentation
2016-05-24, N430, Naturvetarhuset, Umeå University, Umeå, 10:00 (English)
Projects
Cloud Control (C0590801)
Funder
Swedish Research Council, C0590801
Available from: 2016-05-23 Created: 2016-05-13 Last updated: 2016-08-23 (Bibliographically approved)
List of papers
1. Performance Anomaly Detection and Bottleneck Identification
2015 (English). In: ACM Computing Surveys, ISSN 0360-0300, E-ISSN 1557-7341, Vol. 48, no 1, 4. Article in journal (Refereed), Published
Abstract [en]

To meet stringent performance requirements, system administrators must effectively detect undesirable performance behaviours, identify potential root causes, and take adequate corrective measures. The problem of uncovering and understanding performance anomalies and their causes (bottlenecks) in different system and application domains is well studied. To assess progress and research trends and to identify open challenges, we have reviewed major contributions in the area and present our findings in this survey. Our approach provides an overview of anomaly detection and bottleneck identification research as it relates to the performance of computing systems. By identifying fundamental elements of the problem, we are able to categorize existing solutions based on multiple factors such as the detection goals, nature of applications and systems, system observability, and detection methods.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2015
Keyword
Systems performance, performance anomaly detection, bottleneck detection, performance problem identification
National Category
Computer Systems
Research subject
Computer Systems
Identifiers
urn:nbn:se:umu:diva-105991 (URN)
10.1145/2791120 (DOI)
000363733200004 ()
2-s2.0-84938363675 (Scopus ID)
Funder
Swedish Research Council, C0590801
Available from: 2015-07-03 Created: 2015-07-03 Last updated: 2017-12-04 (Bibliographically approved)
2. Apex lake: a framework for enabling smart orchestration
2015 (English). In: Proceedings of the Industry Track of the 16th ACM/IFIP/USENIX Middleware Conference, New York, USA: Association for Computing Machinery (ACM), 2015, 1-7 p., 1. Conference paper, Published paper (Refereed)
Abstract [en]

The introduction of software-defined infrastructures brings additional challenges to the management of cloud infrastructure. With the impending convergence of telecommunications and cloud infrastructures, datacenters become an essential part of an overall integrated environment. The potential scale of such environments has significant implications, as traditional orchestration approaches cannot scale appropriately. However, the combination of infrastructure topology, fine-grained operational data and advanced analytics has the potential to deliver a scalable approach to facilitate orchestration and resource management. In this paper we introduce Apex Lake, a framework designed to address the question of "how to efficiently define and maintain a physical and logical resource and service landscape enriched by operational data, to support orchestration for optimized service delivery?" We also present a use case illustrating how functionality provided by Apex Lake can be used to deal with performance anomalies.

Place, publisher, year, edition, pages
New York, USA: Association for Computing Machinery (ACM), 2015
Keyword
Cloud monitoring and orchestration, Resource Management, Datacenter Management, Software-defined Infrastructure
National Category
Computer Systems
Research subject
Computer Systems
Identifiers
urn:nbn:se:umu:diva-114696 (URN)
10.1145/2830013.2830016 (DOI)
2-s2.0-84981340935 (Scopus ID)
978-1-4503-3727-4 (ISBN)
Conference
16th ACM/IFIP/USENIX Middleware Conference, Middleware Industry 2015, Vancouver, Canada, 7 December 2015 through 11 December 2015
Available from: 2016-01-26 Created: 2016-01-26 Last updated: 2017-11-20 (Bibliographically approved)
3. Performance Anomaly Detection using Datacenter Landscape Graphs
(English). Manuscript (preprint) (Other academic)
National Category
Computer Science
Identifiers
urn:nbn:se:umu:diva-124577 (URN)
Available from: 2016-08-16 Created: 2016-08-16 Last updated: 2016-08-16

Open Access in DiVA

fulltext (1007 kB), 263 downloads
File information
File name: FULLTEXT02.pdf
File size: 1007 kB
Checksum (SHA-512): 6f7ddbeb8489b93962d235d17647d3d030f2d8767623fcd75621cc59fb771e3d823b9b80e34abf646ba0c41edd6fd6f42adf0c3cca0c37511c090798456d22eb
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Ibidunmoye, Olumuyiwa
By organisation
Department of Computing Science
Computer Systems

Total: 263 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 1450 hits