umu.se Publications
1 - 12 of 12
  • 1.
    Ibidunmoye, Olumuyiwa
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Performance anomaly detection and resolution for autonomous clouds. 2017. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Fundamental properties of cloud computing such as resource sharing and on-demand self-servicing are driving a growing adoption of the cloud for hosting both legacy and new application services. A consequence of this growth is that the increasing scale and complexity of the underlying cloud infrastructure, as well as fluctuating service workloads, are inducing performance incidents at a higher frequency than ever before, with far-reaching impact on revenue, reliability, and reputation. Hence, effectively managing performance incidents with emphasis on timely detection, diagnosis and resolution has become a necessity rather than a luxury. While other aspects of cloud management such as monitoring and resource management are experiencing greater automation, automated management of performance incidents remains a major concern.

    Given the volume of operational data produced by cloud datacenters and services, this thesis focuses on how data analytics techniques can be used for cloud performance management. In particular, this work investigates techniques and models for automated performance anomaly detection and prevention in cloud environments. To establish the state of the research area, we present the outcome of an extensive survey of existing research contributions addressing various aspects of performance problem management in diverse systems domains. We discuss the design and evaluation of analytics models and algorithms for detecting performance anomalies in the real-time behaviour of cloud datacenter resources and hosted services at different resolutions. We also discuss the design of a semi-supervised machine learning approach for mitigating performance degradation by actively driving quality of service from undesirable states to a desired target state via incremental capacity optimization. The research methods used in this thesis include experiments on real virtualized testbeds to evaluate some aspects of the proposed techniques, while other aspects are evaluated using performance traces from real-world datacenters.

    Insights and outcomes from this thesis can be used by both cloud and service operators to enhance the automation of performance problem detection, diagnosis and resolution. They also have the potential to spur further research in the area while being applicable in related domains such as Internet of Things (IoT), industrial sensors as well as in edge and mobile clouds.

  • 2.
    Ibidunmoye, Olumuyiwa
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Performance problem diagnosis in cloud infrastructures. 2016. Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    Cloud datacenters comprise hundreds or thousands of disparate application services, each having stringent performance and availability requirements, sharing a finite set of heterogeneous hardware and software resources. The implication of such a complex environment is that the occurrence of performance problems, such as slow application response and unplanned downtime, has become the norm rather than the exception, resulting in decreased revenue, damaged reputation, and huge human effort in diagnosis. Though causes can be as varied as application issues (e.g. bugs), machine-level failures (e.g. faulty server), and operator errors (e.g. misconfigurations), recent studies have identified capacity-related issues, such as resource shortage and contention, as the cause of most performance problems on the Internet today. As cloud datacenters become increasingly autonomous, there is a need for automated performance diagnosis systems that can adapt their operation to reflect the changing workload and topology in the infrastructure. In particular, such systems should be able to detect anomalous performance events, uncover manifestations of capacity bottlenecks, localize actual root cause(s), and possibly suggest or actuate corrections.

    This thesis investigates approaches for diagnosing performance problems in cloud infrastructures. We present the outcome of an extensive survey of existing research contributions addressing performance diagnosis in diverse systems domains. We also present models and algorithms for detecting anomalies in real-time application performance and for identifying anomalous datacenter resources based on operational metrics and spatial dependency across datacenter components. Empirical evaluations of our approaches show how they can be used to improve end-user experience and service assurance and to support root-cause analysis.

  • 3.
    Ibidunmoye, Olumuyiwa
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Ali-Reza, Rezaie
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Adaptive Anomaly Detection in Performance Metric Streams. 2018. In: IEEE Transactions on Network and Service Management, ISSN 1932-4537, E-ISSN 1932-4537, Vol. 15, no 1, p. 217-231. Article in journal (Refereed)
    Abstract [en]

    Continuous detection of performance anomalies such as service degradations has become critical in cloud and Internet services due to their impact on quality of service and end-user experience. However, the volume and fast-changing behaviour of metric streams have made it a challenging task. Many diagnosis frameworks rely on thresholding with stationarity or normality assumptions, or on complex models requiring extensive offline training. Such techniques are known to be prone to spurious false alarms in online settings as metric streams undergo rapid contextual changes from known baselines. Hence, we propose two unsupervised incremental techniques following a two-step strategy. First, we estimate an underlying temporal property of the stream via adaptive learning; then, we apply statistically robust control charts to recognize deviations. We evaluated our techniques by replaying over 40 time-series streams from the Yahoo! Webscope S5 datasets as well as 4 other traces of real web service QoS and ISP traffic measurements. Our methods achieve high detection accuracy and few false alarms, and better performance in general compared to an open-source package for time-series anomaly detection.
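
The two-step strategy in this abstract (adaptively estimate a property of the stream, then apply a robust control chart to the deviations) can be illustrated with a minimal sketch. It is not the paper's algorithm: the exponentially weighted level estimate, the median/MAD control limits, and every name and parameter value below (`detect_anomalies`, `alpha`, `window`, `k`, `warmup`) are assumptions chosen for illustration.

```python
from collections import deque
from statistics import median


def detect_anomalies(stream, alpha=0.3, window=30, k=3.0, warmup=10):
    """Return indices whose one-step-ahead error breaches robust control limits.

    Step 1: track the stream level with an exponentially weighted average.
    Step 2: compare each forecast error with median +/- k * scaled MAD of
            recent errors (an assumed, simple form of robust control chart).
    """
    level = None
    errors = deque(maxlen=window)  # sliding window of recent normal errors
    flagged = []
    for i, x in enumerate(stream):
        if level is None:
            level = x  # initialise the level from the first observation
            continue
        err = x - level
        is_anomaly = False
        if len(errors) >= warmup:
            centre = median(errors)
            mad = median(abs(e - centre) for e in errors)
            scale = 1.4826 * mad or 1e-9  # floor avoids a zero-width band
            is_anomaly = abs(err - centre) > k * scale
        if is_anomaly:
            flagged.append(i)  # anomalies do not update the model
        else:
            errors.append(err)
            level = alpha * x + (1 - alpha) * level
    return flagged
```

Skipping the model update for flagged points keeps a single spike from inflating the control limits; a sustained level shift would need an explicit reset rule, which this sketch omits.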

  • 4.
    Ibidunmoye, Olumuyiwa
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Blackbox Strategies for Detecting Service Performance Anomalies in Virtualized Environments. 2016. Report (Other academic)
    Abstract [en]

    In order to prevent violation of service-level objectives and to guarantee good user experience, detection of symptoms such as slow application response, degraded transaction throughput, and service outages is crucial. We propose a black-box approach for detecting such symptoms in service performance behaviour without intrusive application instrumentation. When a known baseline behaviour exists, we employ kernel density estimation to discover deviations from a given set of baseline measurements. Conversely, when no baseline exists, we apply statistical process control charts on prediction errors obtained from Holt-Winter’s double exponential smoothing to identify anomalies in metric time-series. We evaluate our methods on tail response time traces collected from experiments conducted in a real testbed under realistic load and fault injections. Results show the applicability of our approach for improving service assurance and also demonstrate how service-level anomalies correlate with system-level events such as resource contention and bottlenecks.
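
The two cases described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the report's implementation: the Gaussian kernel, the fixed bandwidth, the smoothing constants, the calibration prefix, and all function names are hypothetical choices.

```python
import math
from statistics import mean, pstdev


def kde_density(x, baseline, bandwidth=1.0):
    """Gaussian kernel density of x under the baseline measurements."""
    norm = len(baseline) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - b) / bandwidth) ** 2) for b in baseline) / norm


def holt_errors(series, alpha=0.5, beta=0.3):
    """One-step-ahead errors of double (level + trend) exponential smoothing."""
    level, trend = series[1], series[1] - series[0]
    errors = []
    for x in series[2:]:
        forecast = level + trend
        errors.append(x - forecast)
        new_level = alpha * x + (1.0 - alpha) * forecast
        trend = beta * (new_level - level) + (1.0 - beta) * trend
        level = new_level
    return errors


def chart_violations(errors, calibration=20, k=3.0):
    """Shewhart-style chart: limits come from the first `calibration` errors."""
    reference = errors[:calibration]
    centre, sigma = mean(reference), pstdev(reference)
    return [i for i, e in enumerate(errors)
            if i >= calibration and abs(e - centre) > k * sigma]


def detect_without_baseline(series):
    """Map chart violations back to positions in the original series."""
    return [i + 2 for i in chart_violations(holt_errors(series))]
```

With a baseline, a measurement would be treated as deviant when `kde_density` falls below some chosen cut-off; without one, `detect_without_baseline` flags points whose forecast error leaves the control limits. Both the cut-off and the limit width `k` are tuning choices left open here.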

  • 5.
    Ibidunmoye, Olumuyiwa
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Francisco, Hernandez-Rodriguez
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Performance Anomaly Detection and Bottleneck Identification. 2015. In: ACM Computing Surveys, ISSN 0360-0300, E-ISSN 1557-7341, Vol. 48, no 1, article id 4. Article in journal (Refereed)
    Abstract [en]

    In order to meet stringent performance requirements, system administrators must effectively detect undesirable performance behaviours, identify potential root causes and take adequate corrective measures. The problem of uncovering and understanding performance anomalies and their causes (bottlenecks) in different system and application domains is well studied. In order to assess progress and research trends and to identify open challenges, we have reviewed major contributions in the area and present our findings in this survey. Our approach provides an overview of anomaly detection and bottleneck identification research as it relates to the performance of computing systems. By identifying fundamental elements of the problem, we are able to categorize existing solutions based on multiple factors such as the detection goals, nature of applications and systems, system observability, and detection methods.

  • 6.
    Ibidunmoye, Olumuyiwa
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Lakew, Ewnetu Bayuh
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    A Black-box Approach for Detecting Systems Anomalies in Virtualized Environments. 2017. In: 2017 IEEE International Conference on Cloud and Autonomic Computing (ICCAC 2017), IEEE, 2017, p. 22-33. Conference paper (Refereed)
    Abstract [en]

    Virtualization technologies allow cloud providers to optimize server utilization and cost by co-locating services in as few servers as possible. Studies have shown how applications in multi-tenant environments are susceptible to systems anomalies such as abnormal resource usage due to performance interference. Effective detection of such anomalies requires techniques that can adapt autonomously to dynamic service workloads, require limited instrumentation to cope with diverse application services, and infer relationships between anomalies non-intrusively to avoid "alarm fatigue" due to scale. We propose a black-box framework that includes an unsupervised prediction-based mechanism for automated anomaly detection in multi-dimensional resource behaviour of datacenter nodes and a graph-theoretic technique for ranking anomalous nodes across the datacenter. The proposed framework is evaluated using resource traces of over 100 virtual machines obtained from a production cluster as well as traces obtained from an experimental testbed under realistic service composition. The technique achieves average normalized root mean squared forecast error and R^2 of (0.92, 0.07) across host servers and (0.70, 0.39) across virtual machines. Also, the average detection rate is 88% while explaining 62% of SLA violations with an average lead-time of 6 time-points when the testbed is actively perturbed under three contention scenarios.
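
For readers unfamiliar with the two forecast-quality measures named in this abstract, the sketch below shows one common way to compute them. The abstract does not say how the error is normalized, so dividing the RMSE by the range of the observed values is an assumption, as are the function names.

```python
import math


def nrmse(actual, predicted):
    """Root mean squared error divided by the range of the actual values."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse) / (max(actual) - min(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual variation / total variation."""
    avg = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

Under this convention a perfect forecast gives an NRMSE of 0 and an R^2 of 1, while a forecast that always predicts the mean of the observations gives an R^2 of 0.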

  • 7.
    Ibidunmoye, Olumuyiwa
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Metsch, T.
    Bayon-Molino, V.
    Elmroth, E.
    Performance Anomaly Detection using Datacenter Landscape Graphs. Manuscript (preprint) (Other academic)
  • 8.
    Ibidunmoye, Olumuyiwa
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Metsch, T.
    Elmroth, E.
    Real-time Detection of Performance Anomalies for Cloud Service. Manuscript (preprint) (Other academic)
  • 9.
    Ibidunmoye, Olumuyiwa
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Metsch, Thijs
    Intel Labs Europe, Collingstown Industrial Park, Leixlip, Ireland.
    Bayon-Molino, Victor
    Intel Labs Europe, Collingstown Industrial Park, Leixlip, Ireland.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Performance Anomaly Detection using Datacenter Landscape Graphs. 2017. Conference paper (Other academic)
    Abstract [en]

    The migration of mission-critical workloads to the cloud and the automation of various aspects of datacenter management are contributing to the evolution of software-defined infrastructures. One implication of this evolution is that the composition (both physical and virtual) and logical topology of datacenters are becoming even more dynamic. Identification of performance problems (e.g. bottlenecks) in such environments needs to be done with awareness of this dynamic topology to understand the impact of dependencies among components. A technique is introduced that a) employs expert knowledge to identify bottleneck components using associated performance metrics, and b) utilizes dynamic dependencies to rank problem components in order to facilitate diagnosis efforts. The technique is demonstrated experimentally on an OpenStack testbed with realistic fault injection. Results of experimental case studies show that the technique is able to correctly detect and rank problem nodes.
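
The two parts of the technique, a) flagging components from their metrics and b) ranking them using dependencies, can be illustrated with a small sketch. The abstract gives no details of either step, so the threshold rules standing in for expert knowledge, the "count the flagged components that depend on me" ranking, and all component and metric names are hypothetical.

```python
def flag_components(metrics, rules):
    """Flag a component when any metric exceeds its expert-defined threshold."""
    return {name for name, values in metrics.items()
            if any(values.get(metric, 0.0) > limit for metric, limit in rules.items())}


def rank_components(flagged, depends_on):
    """Rank flagged components: those that more flagged components rely on come first."""
    def reachable(start):
        # Depth-first walk over the (possibly changing) dependency graph.
        seen, stack = set(), [start]
        while stack:
            for dep in depends_on.get(stack.pop(), ()):
                if dep not in seen:
                    seen.add(dep)
                    stack.append(dep)
        return seen

    score = {component: 0 for component in flagged}
    for component in flagged:
        for dep in reachable(component):
            if dep in score and dep != component:
                score[dep] += 1
    return sorted(flagged, key=lambda c: (-score[c], c))


# Hypothetical landscape: app1 runs on vm1, and vm1 and vm2 share host1.
metrics = {
    "host1": {"cpu_util": 0.97},
    "vm1": {"cpu_util": 0.95},
    "vm2": {"cpu_util": 0.20},
    "app1": {"cpu_util": 0.93},
}
depends_on = {"app1": {"vm1"}, "vm1": {"host1"}, "vm2": {"host1"}}
flagged = flag_components(metrics, {"cpu_util": 0.9})
ranking = rank_components(flagged, depends_on)
```

Because the dependency map is an ordinary argument, it can be rebuilt whenever the topology changes, which is the point the abstract makes about dynamic landscapes.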

  • 10.
    Ibidunmoye, Olumuyiwa
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Metsch, Thijs
    Intel Labs Europe, Ireland.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Real-time Detection of Performance Anomalies for Cloud Services. 2016. In: 2016 IEEE/ACM 24th International Symposium on Quality of Service (IWQoS 2016), IEEE Communications Society, 2016, p. 164-165. Conference paper (Refereed)
    Abstract [en]

    We propose two adaptive techniques for detecting anomalies in real-time service performance measurements. The techniques yielded low false alarm rates when evaluated on multiple time-series from the Yahoo! Webscope anomaly detection traces. 

  • 11.
    Ibidunmoye, Olumuyiwa
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Moghadam, Mahshid Helali
    Department of Computer Engineering, University of Kashan.
    Lakew, Ewnetu Bayuh
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Adaptive Service Performance Control using Cooperative Fuzzy Reinforcement Learning in Virtualized Environments. 2017. In: UCC '17 Proceedings of the 10th International Conference on Utility and Cloud Computing, IEEE/ACM, 2017, p. 19-28. Conference paper (Refereed)
    Abstract [en]

    Designing efficient control mechanisms to meet strict performance requirements with respect to changing workload demands without sacrificing resource efficiency remains a challenge in cloud infrastructures. A popular approach is fine-grained resource provisioning via auto-scaling mechanisms that rely on either threshold-based adaptation rules or sophisticated queuing/control-theoretic models. While it is difficult at design time to specify optimal threshold rules, it is even more challenging to infer precise performance models for the multitude of services. Recently, reinforcement learning has been applied to address this challenge. However, such approaches require many learning trials to stabilize at the beginning and when operational conditions vary, thereby limiting their application under dynamic workloads. To this end, we extend the standard reinforcement learning approach in two ways: a) we formulate the system state as a fuzzy space and b) exploit a set of cooperative agents to explore multiple fuzzy states in parallel to speed up learning. Through multiple experiments on a real virtualized testbed, we demonstrate that our approach converges quickly and meets performance targets at high efficiency without explicit service models.
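
The idea of treating the system state as a fuzzy space can be made concrete with a simplified fuzzy Q-learning update. This is a generic textbook-style sketch, not the paper's controller: the three fuzzy sets over a normalized utilization value, the action set, the learning constants, and the use of one shared table to stand in for cooperating agents are all assumptions.

```python
ACTIONS = (-1, 0, 1)  # hypothetical: remove, keep, or add one capacity unit


def memberships(util):
    """Degrees of membership of a utilization in [0, 1] to low, medium, high."""
    u = min(max(util, 0.0), 1.0)
    return [max(0.0, 1 - 2 * u), max(0.0, 1 - abs(2 * u - 1)), max(0.0, 2 * u - 1)]


def q_value(q, phi, action):
    """Q of a crisp state: membership-weighted mix of per-fuzzy-state values."""
    return sum(p * row[action] for p, row in zip(phi, q))


def state_value(q, phi):
    """Value of a crisp state: membership-weighted best action values."""
    return sum(p * max(row) for p, row in zip(phi, q))


def update(q, util, action, reward, next_util, lr=0.5, gamma=0.9):
    """Apply one temporal-difference step, spread over the active fuzzy states."""
    phi, phi_next = memberships(util), memberships(next_util)
    td_error = reward + gamma * state_value(q, phi_next) - q_value(q, phi, action)
    for i, p in enumerate(phi):
        q[i][action] += lr * p * td_error
    return td_error


# One table shared by two hypothetical agents exploring different fuzzy states.
shared_q = [[0.0] * len(ACTIONS) for _ in range(3)]
update(shared_q, 1.0, 2, 1.0, 0.5)  # agent A: high load, adds capacity
update(shared_q, 0.0, 0, 1.0, 1.0)  # agent B: low load, removes capacity
```

Since each crisp observation updates every fuzzy state it partially belongs to, and several agents write to the same table, experience gathered in one region is available to all agents immediately, which is the intuition behind the faster convergence the abstract reports.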

  • 12.
    Metsch, Thijs
    et al.
    Intel Labs Europe, Intel Ireland.
    Ibidunmoye, Olumuyiwa
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Bayon-Molino, Victor
    Intel Labs Europe, Intel Ireland.
    Butler, Joe
    Intel Labs Europe, Intel Ireland.
    Hernández-Rodriguez, Francisco
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Apex lake: a framework for enabling smart orchestration. 2015. In: Proceedings of the Industry Track of the 16th ACM/IFIP/USENIX Middleware Conference, New York, USA: Association for Computing Machinery (ACM), 2015, p. 1-7, article id 1. Conference paper (Refereed)
    Abstract [en]

    The introduction of software-defined infrastructures brings additional challenges to the management of cloud infrastructure. With the impending convergence of telecommunications and cloud infrastructures, datacenters become an essential part of an overall integrated environment. The potential scale of such environments has significant implications, as traditional orchestration approaches cannot scale appropriately. However, the combination of infrastructure topology, fine-grained operational data and advanced analytics has the potential to deliver a scalable approach to facilitate orchestration and resource management. In this paper we introduce Apex Lake, a framework designed to address the question of "how to efficiently define and maintain a physical and logical resource and service landscape enriched by operational data, to support orchestration for optimized service delivery?" We also present a use case illustrating how functionalities provided by Apex Lake can be used to deal with performance anomalies.
