Elmroth, Erik
Publications (10 of 178)
Vu, X.-S., Addi, A.-M., Elmroth, E. & Lili, J. (2019). Graph-based Interactive Data Federation System for Heterogeneous Data Retrieval and Analytics. In: Proceedings of The 30th TheWebConf'19 (formerly WWW), USA. Paper presented at The Web Conference, San Francisco, USA, May 13-17, 2019 (pp. 3595-3599). New York, NY, USA: ACM Digital Library
2019 (English). In: Proceedings of The 30th TheWebConf'19 (formerly WWW), USA, New York, NY, USA: ACM Digital Library, 2019, p. 3595-3599. Conference paper, Published paper (Refereed)
Abstract [en]

Given the increasing amount of heterogeneous data stored in relational databases, file systems, or cloud environments, such data needs to be easily accessed and semantically connected for further data analytics. The potential of data federation is largely untapped. This paper presents an interactive data federation system (https://vimeo.com/319473546) that applies large-scale techniques, including heterogeneous data federation, natural language processing, association rules, and semantic web technologies, to perform data retrieval and analytics on social network data. The system first creates a Virtual Database (VDB) to virtually integrate data from multiple data sources. Next, an RDF generator is built to unify the data and, together with SPARQL queries, to support semantic search over text data processed by natural language processing (NLP). Association rule analysis is used to discover patterns and identify the most important co-occurrences of variables from multiple data sources. The system demonstrates how it facilitates interactive data analytics for different application scenarios (e.g., sentiment analysis, privacy-concern analysis, community detection).
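The association rule step mentioned in the abstract can be illustrated with a minimal sketch. The record does not describe how the system computes its rules, so the function names, the toy transactions, and the choice of plain support and confidence measures are assumptions made here for illustration only.

```python
from typing import Iterable, List, Set

# Hypothetical toy data: each transaction is the set of variables
# observed together in one record drawn from the federated sources.
TOY_TRANSACTIONS: List[Set[str]] = [
    {"a", "b"},
    {"a", "b", "c"},
    {"a"},
    {"b", "c"},
]


def support(transactions: List[Set[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in itemset."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions: List[Set[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    antecedent_items = set(antecedent)
    base = support(transactions, antecedent_items)
    if base == 0.0:
        return 0.0
    both = support(transactions, antecedent_items | set(consequent))
    return both / base
```

A rule such as {a} -> {b} would be kept only if both its support and its confidence exceed thresholds chosen by the analyst; those thresholds are not given in the record.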

Place, publisher, year, edition, pages
New York, NY, USA: ACM Digital Library, 2019
Keywords
heterogeneous data federation, RDF, interactive data analysis
National Category
Language Technology (Computational Linguistics)
Identifiers
urn:nbn:se:umu:diva-160892 (URN), 10.1145/3308558.3314138 (DOI), 978-1-4503-6674-8 (ISBN)
Conference
The Web Conference, San Francisco, USA, May 13-17, 2019
Available from: 2019-06-25 Created: 2019-06-25 Last updated: 2019-08-22 Bibliographically approved
Nguyen, C. L., Klein, C. & Elmroth, E. (2019). Multivariate LSTM-based Location-aware Workload Prediction for Edge Data Centers. In: 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Larnaca, May 14-17, 2019. Paper presented at 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (pp. 341-350).
2019 (English). In: 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, Larnaca, May 14-17, 2019, 2019, p. 341-350. Conference paper, Published paper (Refereed)
Abstract [en]

Mobile Edge Clouds (MECs) are a promising computing platform to overcome challenges for the success of bandwidth-hungry, latency-critical applications by distributing computing and storage capacity at the edge of the network as Edge Data Centers (EDCs) in close vicinity of end-users. Due to the heterogeneous distributed resource capacity in EDCs and the application deployment flexibility coupled with user mobility, MECs bring significant challenges to control resource allocation and provisioning. In order to develop a self-managed system for MECs that efficiently decides how much and when to activate scaling, and where to place and migrate services, it is crucial to predict its workload characteristics, including variations over time and locality.

To this end, we present a novel location-aware workload predictor for EDCs. Our approach leverages the correlation among workloads of EDCs in close physical distance and applies a multivariate Long Short-Term Memory (LSTM) network to achieve on-line workload predictions for each EDC. The experiments with two real mobility traces show that our proposed approach can achieve better prediction accuracy than a state-of-the-art location-unaware method (up to 44%) and a location-aware method (up to 17%). Further, through an intensive performance measurement using various input shaking methods, we substantiate that the proposed approach achieves a reliable and consistent performance.

Keywords
Mobile Edge Cloud, Edge Data Center, Resource Management, Workload Prediction, Location-aware, Machine Learning
National Category
Computer Systems
Research subject
Computer Systems
Identifiers
urn:nbn:se:umu:diva-159540 (URN)
Conference
2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
Available from: 2019-05-30 Created: 2019-05-30 Last updated: 2019-06-12 Bibliographically approved
Ibidunmoye, O., Ali-Reza, R. & Elmroth, E. (2018). Adaptive Anomaly Detection in Performance Metric Streams. IEEE Transactions on Network and Service Management, 15(1), 217-231
2018 (English). In: IEEE Transactions on Network and Service Management, ISSN 1932-4537, E-ISSN 1932-4537, Vol. 15, no 1, p. 217-231. Article in journal (Refereed) Published
Abstract [en]

Continuous detection of performance anomalies such as service degradations has become critical in cloud and Internet services due to their impact on quality of service and end-user experience. However, the volume and fast-changing behaviour of metric streams have rendered it a challenging task. Many diagnosis frameworks rely on thresholding with stationarity or normality assumptions, or on complex models requiring extensive offline training. Such techniques are known to be prone to spurious false alarms in online settings as metric streams undergo rapid contextual changes from known baselines. Hence, we propose two unsupervised incremental techniques following a two-step strategy. First, we estimate an underlying temporal property of the stream via adaptive learning, and then we apply statistically robust control charts to recognize deviations. We evaluated our techniques by replaying over 40 time-series streams from the Yahoo! Webscope S5 datasets as well as 4 other traces of real web service QoS and ISP traffic measurements. Our methods achieve high detection accuracy and few false alarms, and better performance in general compared to an open-source package for time-series anomaly detection.
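The two-step strategy (an adaptive estimate of the stream followed by a control chart) can be sketched as follows. This is not the authors' method: the abstract does not name the estimator or the chart, so the exponentially weighted moving average, the k-sigma limit, and all parameter names and defaults below are assumptions chosen for illustration.

```python
from typing import Iterable, List


def ewma_control_chart(stream: Iterable[float],
                       alpha: float = 0.2,
                       k: float = 3.0,
                       warmup: int = 5) -> List[int]:
    """Return the indices of points flagged as anomalous.

    Step 1 (adaptive estimate): track the level and variance of the
    stream incrementally with exponentially weighted moving averages.
    Step 2 (control chart): flag a point lying more than k estimated
    standard deviations away from the current level.
    """
    anomalies: List[int] = []
    mean = 0.0
    var = 0.0
    for i, x in enumerate(stream):
        if i == 0:
            mean = x  # initialise the baseline with the first sample
            continue
        deviation = x - mean
        std = var ** 0.5
        if i >= warmup and std > 0 and abs(deviation) > k * std:
            anomalies.append(i)
            continue  # keep the outlier out of the baseline estimate
        # Incremental update of the level and variance estimates.
        mean += alpha * deviation
        var = (1 - alpha) * (var + alpha * deviation ** 2)
    return anomalies
```

Because the estimates are updated one sample at a time, no offline training pass is needed, which is the property the abstract emphasises; a robust variant would replace the mean and variance with median-based estimates.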

Place, publisher, year, edition, pages
IEEE, 2018
Keywords
Performance Monitoring and Measurement, Computer Network Management, Quality of Service, Time Series Analysis, Anomaly Detection, Unsupervised Learning
National Category
Computer Systems
Research subject
Computer Science; Computing Science; Computer Systems
Identifiers
urn:nbn:se:umu:diva-142030 (URN), 10.1109/TNSM.2017.2750906 (DOI), 000427420100016 ()
Projects
Cloud Control
Funder
Swedish Research Council, C0590801
Available from: 2017-11-17 Created: 2017-11-17 Last updated: 2018-08-07 Bibliographically approved
Krzywda, J., Ali-Eldin, A., Wadbro, E., Östberg, P.-O. & Elmroth, E. (2018). ALPACA: Application Performance Aware Server Power Capping. In: ICAC 2018: 2018 IEEE International Conference on Autonomic Computing (ICAC), Trento, Italy, September 3-7, 2018. Paper presented at 15th IEEE International Conference on Autonomic Computing (ICAC 2018) (pp. 41-50). IEEE Computer Society
2018 (English). In: ICAC 2018: 2018 IEEE International Conference on Autonomic Computing (ICAC), Trento, Italy, September 3-7, 2018, IEEE Computer Society, 2018, p. 41-50. Conference paper, Published paper (Refereed)
Abstract [en]

Server power capping limits the power consumption of a server so that it does not exceed a specific power budget. This allows data center operators to reduce the peak power consumption at the cost of performance degradation of hosted applications. Previous work on server power capping rarely considers Quality-of-Service (QoS) requirements of consolidated services when enforcing the power budget. In this paper, we introduce ALPACA, a framework to reduce QoS violations and overall application performance degradation for consolidated services. ALPACA reduces unnecessarily high power consumption when there is no performance gain, and divides the power among the running services in a way that reduces the overall QoS degradation when power is scarce. We evaluate ALPACA using four applications: MediaWiki, SysBench, Sock Shop, and CloudSuite’s Web Search benchmark. Our experiments show that ALPACA reduces the operational costs of QoS penalties and electricity by up to 40% compared to a non-optimized system.

Place, publisher, year, edition, pages
IEEE Computer Society, 2018
Series
IEEE Conference Publication, ISSN 2474-0756
Keywords
power capping, performance degradation, power-performance tradeoffs
National Category
Computer Systems
Research subject
business data processing
Identifiers
urn:nbn:se:umu:diva-132428 (URN), 10.1109/ICAC.2018.00014 (DOI), 978-1-5386-5139-1 (ISBN)
Conference
15th IEEE International Conference on Autonomic Computing (ICAC 2018)
Available from: 2017-03-13 Created: 2017-03-13 Last updated: 2019-08-07 Bibliographically approved
Mehta, A. & Elmroth, E. (2018). Distributed Cost-Optimized Placement for Latency-Critical Applications in Heterogeneous Environments. In: Proceedings of the IEEE 15th International Conference on Autonomic Computing (ICAC). Paper presented at 2018 IEEE International Conference on Autonomic Computing, Trento, Italy, September 3-7, 2018 (pp. 121-130). IEEE Computer Society
2018 (English). In: Proceedings of the IEEE 15th International Conference on Autonomic Computing (ICAC), IEEE Computer Society, 2018, p. 121-130. Conference paper, Published paper (Refereed)
Abstract [en]

Mobile Edge Clouds (MECs) with 5G will create new opportunities to develop latency-critical applications in domains such as intelligent transportation systems, process automation, and smart grids. However, it is not clear how one can cost-efficiently deploy and manage a large number of such applications given the heterogeneity of devices, application performance requirements, and workloads. This work explores cost and performance dynamics for IoT applications, and proposes distributed algorithms for automatic deployment of IoT applications in heterogeneous environments. Placement algorithms were evaluated with respect to metrics including the number of required runtimes, applications’ slowdown, and the number of iterations used to place an application. Iterative search-based distributed algorithms such as Size Interval Actor Assignment in Groups (SIAA G) outperformed random and bin packing algorithms, and are therefore recommended for this purpose. The Size Interval Actor Assignment in Groups at Least Utilized Runtime (SIAA G LUR) algorithm is also recommended when minimizing the number of iterations is important. The tradeoff of using SIAA G algorithms is a few extra runtimes compared to bin packing algorithms.

Place, publisher, year, edition, pages
IEEE Computer Society, 2018
Series
Proceedings of the International Conference on Autonomic Computing, ISSN 2474-0764
Keywords
Mobile Edge Clouds, Fog Computing, IoTs, Distributed algorithms
National Category
Computer Systems
Identifiers
urn:nbn:se:umu:diva-151457 (URN), 10.1109/ICAC.2018.00022 (DOI), 978-1-5386-5139-1 (ISBN)
Conference
2018 IEEE International Conference on Autonomic Computing, Trento, Italy, September 3-7, 2018
Available from: 2018-09-04 Created: 2018-09-04 Last updated: 2019-06-26 Bibliographically approved
Karakostas, V., Goumas, G., Bayuh Lakew, E., Elmroth, E., Gerangelos, S., Kolberg, S., . . . Koziris, N. (2018). Efficient Resource Management for Data Centers: The ACTiCLOUD Approach. Paper presented at SAMOS XVIII, July 15–19, 2018, Pythagorion, Samos Island, Greece.
2018 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Despite their proliferation as a dominant computing paradigm, cloud computing systems lack effective mechanisms to manage their vast resources efficiently. Resources are stranded and fragmented, limiting cloud applicability only to classes of applications that pose moderate resource demands. In addition, the need for reduced cost through consolidation introduces performance interference, as multiple VMs are co-located on the same nodes. To avoid such issues, current providers follow a rather conservative approach regarding resource management that leads to significant underutilization. ACTiCLOUD is a three-year Horizon 2020 project that aims at creating a novel cloud architecture that breaks existing scale-up and share-nothing barriers and enables the holistic management of physical resources, at both local and distributed cloud site levels. This extended abstract provides a brief overview of the resource management part of ACTiCLOUD, focusing on the design principles and the components.

Keywords
resource management, resource efficiency, cloud computing, data centers, in-memory databases, NUMA, heterogeneous, scale-up/out
National Category
Computer Systems
Identifiers
urn:nbn:se:umu:diva-154435 (URN), 10.1145/3229631.3236095 (DOI), 000475843000033 (), 2-s2.0-85060997517 (Scopus ID), 978-1-4503-6494-2 (ISBN)
Conference
SAMOS XVIII, July 15–19, 2018, Pythagorion, Samos Island, Greece
Available from: 2018-12-18 Created: 2018-12-18 Last updated: 2019-08-12 Bibliographically approved
Bhuyan, M. H. & Elmroth, E. (2018). Multi-Scale Low-Rate DDoS Attack Detection Using the Generalized Total Variation Metric. In: 17th IEEE International Conference on Machine Learning and Applications. Paper presented at 17th IEEE International Conference on Machine Learning and Applications, 2018, 17-20 December, Orlando, FL, USA (pp. 1040-1047). IEEE
2018 (English). In: 17th IEEE International Conference on Machine Learning and Applications, IEEE, 2018, p. 1040-1047. Conference paper, Published paper (Refereed)
Abstract [en]

We propose a mechanism to detect multi-scale low-rate DDoS attacks which uses a generalized total variation metric. The proposed metric is highly sensitive to different variations in the network traffic and evokes more distance between legitimate and attack traffic than other detection mechanisms. Most low-rate attackers invade the security system by scaling periodic packet bursts in and out towards the bottleneck router, which severely degrades the Quality of Service (QoS) of TCP applications. Our proposed mechanism can effectively identify attack traffic of this nature, despite its similarity to legitimate traffic, based on the spacing value of our metric. We evaluated our mechanism using datasets from CAIDA DDoS, MIT Lincoln Lab, and real-time testbed traffic. Our results demonstrate that our mechanism exhibits good accuracy and scalability in the detection of multi-scale low-rate DDoS attacks.
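For orientation, the standard total variation distance between two empirical distributions can be computed as in the sketch below. The abstract does not define the paper's generalized metric, so this shows only the textbook (ungeneralized) form; the function name and the idea of comparing a baseline traffic window against an observed window are assumptions for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def total_variation_distance(a: Sequence[Hashable],
                             b: Sequence[Hashable]) -> float:
    """Half the L1 distance between the empirical distributions of a and b.

    Returns 0.0 for identical distributions and 1.0 for distributions
    with disjoint support.
    """
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    pa, pb = Counter(a), Counter(b)
    keys = set(pa) | set(pb)
    # A Counter returns 0 for missing keys, so unseen values contribute
    # their full probability mass to the distance.
    return 0.5 * sum(abs(pa[k] / len(a) - pb[k] / len(b)) for k in keys)
```

In a detection setting, each sample might hold a per-packet feature (for example, source address) from one time window, with a window flagged when its distance from the baseline exceeds a threshold; the threshold and features used in the paper are not given in the record.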

Place, publisher, year, edition, pages
IEEE, 2018
Keywords
Multi-scale, Distributed denial of service, Low-rate, Total variation metric
National Category
Computer Sciences
Research subject
Computer and Information Science
Identifiers
urn:nbn:se:umu:diva-155560 (URN), 10.1109/ICMLA.2018.00170 (DOI), 978-1-5386-6805-4 (ISBN)
Conference
17th IEEE International Conference on Machine Learning and Applications, 2018, 17-20 December, Orlando, FL, USA
Funder
The Kempe Foundations, SMK-1644
Available from: 2019-01-22 Created: 2019-01-22 Last updated: 2019-01-22 Bibliographically approved
Krzywda, J., Ali-Eldin, A., Carlson, T. E., Östberg, P.-O. & Elmroth, E. (2018). Power-performance tradeoffs in data center servers: DVFS, CPU pinning, horizontal, and vertical scaling. Future generations computer systems, 81, 114-128
2018 (English). In: Future generations computer systems, ISSN 0167-739X, E-ISSN 1872-7115, Vol. 81, p. 114-128. Article in journal (Refereed) Published
Abstract [en]

Dynamic Voltage and Frequency Scaling (DVFS), CPU pinning, horizontal scaling, and vertical scaling are four techniques that have been proposed as actuators to control the performance and energy consumption of data center servers. This work investigates the utility of these four actuators, and quantifies the power-performance tradeoffs associated with them. Using replicas of the German Wikipedia running on our local testbed, we perform a set of experiments to quantify the influence of DVFS, vertical and horizontal scaling, and CPU pinning on end-to-end response time (average and tail), throughput, and power consumption with different workloads. Results of the experiments show that DVFS rarely reduces the power consumption of underloaded servers by more than 5%, but it can be used to limit the maximal power consumption of a saturated server by up to 20% (at a cost of performance degradation). CPU pinning reduces the power consumption of underloaded servers (by up to 7%) at the cost of performance degradation, which can be limited by choosing an appropriate CPU pinning scheme. Horizontal and vertical scaling improve both the average and tail response time, but the improvement is not proportional to the amount of resources added. The load balancing strategy has a big impact on the tail response time of horizontally scaled applications.

Keywords
Power-performance tradeoffs, Dynamic Voltage and Frequency Scaling (DVFS), CPU pinning, Horizontal scaling, Vertical scaling
National Category
Computer Systems
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-132427 (URN), 10.1016/j.future.2017.10.044 (DOI), 000423652200010 (), 2-s2.0-85033772481 (Scopus ID)
Note

Originally published in thesis in manuscript form.

Available from: 2017-03-13 Created: 2017-03-13 Last updated: 2019-07-02 Bibliographically approved
Gonzalo P., R., Elmroth, E., Östberg, P.-O. & Ramakrishnan, L. (2018). ScSF: a scheduling simulation framework. In: Proceedings of the 21th Workshop on Job Scheduling Strategies for Parallel Processing. Paper presented at 21th Workshop on Job Scheduling Strategies for Parallel Processing (JSSP 2017), Orlando FL, USA, June 2nd, 2017 (pp. 152-173). Springer, 10773
2018 (English). In: Proceedings of the 21th Workshop on Job Scheduling Strategies for Parallel Processing, Springer, 2018, Vol. 10773, p. 152-173. Conference paper, Published paper (Refereed)
Abstract [en]

High-throughput and data-intensive applications are increasingly present, often composed as workflows, in the workloads of current HPC systems. At the same time, trends for future HPC systems point towards more heterogeneous systems with deeper I/O and memory hierarchies. However, current HPC schedulers are designed to support classical large tightly coupled parallel jobs over homogeneous systems. Therefore, there is an urgent need to investigate new scheduling algorithms that can manage the future workloads on HPC systems. However, there is a lack of appropriate models and frameworks to enable development, testing, and validation of new scheduling ideas.

In this paper, we present an open-source scheduler simulation framework (ScSF) that covers all the steps of scheduling research through simulation. ScSF provides capabilities for workload modeling, workload generation, system simulation, comparative workload analysis, and experiment orchestration. The simulator is designed to be run over a distributed computing infrastructure, enabling tests at scale. We describe in detail a use case of ScSF to develop new techniques to manage scientific workflows in a batch scheduler. In the use case, such a technique was implemented in the framework scheduler. For evaluation purposes, 1728 experiments, equivalent to 33 years of simulated time, were run in a deployment of ScSF over a distributed infrastructure of 17 compute nodes during two months. The experimental results were then analyzed in the framework to judge that the technique minimizes workflows’ turnaround time without over-allocating resources. Finally, we discuss lessons learned from our experiences that will help future researchers.

Place, publisher, year, edition, pages
Springer, 2018
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349
Keywords
slurm, simulation, scheduling, HPC, High Performance Computing, workload, generation, analysis
National Category
Computer Sciences
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-132981 (URN), 10.1007/978-3-319-77398-8_9 (DOI), 000444863700009 (), 978-3-319-77397-1 (ISBN), 978-3-319-77398-8 (ISBN)
Conference
21th Workshop on Job Scheduling Strategies for Parallel Processing (JSSP 2017), Orlando FL, USA, June 2nd, 2017
Funder
eSSENCE - An eScience Collaboration
Swedish Research Council, C0590801
Note

Work also supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research (ASCR) and we used resources at the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility, supported by the Office of Science of the U.S. Department of Energy, both under Contract No. DE-AC02-05CH11231.

Available from: 2017-03-27 Created: 2017-03-27 Last updated: 2018-10-05 Bibliographically approved
Bayuh Lakew, E., Birke, R., Perez, J. F., Elmroth, E. & Chen, L. Y. (2018). SmallTail: Scaling Cores and Probabilistic Cloning Requests for Web Systems. In: 15TH IEEE INTERNATIONAL CONFERENCE ON AUTONOMIC COMPUTING (ICAC 2018). Paper presented at 15th IEEE International Conference on Autonomic Computing (ICAC), SEP 03-07, 2018, Trento, ITALY (pp. 31-40). IEEE
2018 (English). In: 15TH IEEE INTERNATIONAL CONFERENCE ON AUTONOMIC COMPUTING (ICAC 2018), IEEE, 2018, p. 31-40. Conference paper, Published paper (Refereed)
Abstract [en]

Users' quality of experience on web systems is largely determined by the tail latency, e.g., the 95th percentile. Scaling resources, e.g., the number of virtual cores per VM, is shown to be effective in meeting the average latency but falls short in taming the latency tail in the cloud, where the performance variability is higher. The prior art shows the prominence of increasing the request redundancy to curtail the latency either in the off-line setting or without scaling-in cores of virtual machines. In this paper, we propose an opportunistic scaler, termed SmallTail, which aims to achieve stringent targets of tail latency while provisioning a minimum amount of resources and keeping them well utilized. Against dynamic workloads, SmallTail simultaneously adjusts the core provisioning per VM and probabilistically replicates requests so as to achieve the tail latency target. The core of SmallTail is a two-level controller, where the outer loop controls the core provisioning per distributed VM and the inner loop controls the clones at a finer granularity. We also provide theoretical analysis of the steady-state latency for a given probabilistic replication that clones one out of N arriving requests. We extensively evaluate SmallTail on three different web systems, namely web commerce, web searching, and web bulletin board. Our testbed results show that SmallTail can keep the 95th percentile latency below 1000 ms using up to 53% fewer cores than the strategy of constant cloning, whereas a core-scaling-only solution exceeds the latency target by up to 70%.
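The effect of probabilistic request cloning on tail latency can be explored with a small simulation. This sketch is not SmallTail's controller: the exponential service-time model, the rule that a cloned request completes when its faster copy does, the absence of queueing, and all names and parameters are assumptions for illustration.

```python
import math
import random
from typing import List


def percentile(values: List[float], q: float) -> float:
    """Nearest-rank percentile of values, with q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100.0))
    return ordered[rank - 1]


def simulate_latencies(n_requests: int, clone_prob: float,
                       seed: int = 0) -> List[float]:
    """Simulate response times when each request is cloned with clone_prob.

    Setting clone_prob = 1 / N corresponds to cloning, on average, one out
    of N arriving requests. Both service times are always drawn so that
    runs sharing a seed differ only in which requests are cloned.
    """
    rng = random.Random(seed)
    latencies = []
    for _ in range(n_requests):
        primary = rng.expovariate(1.0)  # hypothetical service-time model
        backup = rng.expovariate(1.0)   # service time of the potential clone
        cloned = rng.random() < clone_prob
        # A cloned request finishes as soon as its faster copy responds.
        latencies.append(min(primary, backup) if cloned else primary)
    return latencies
```

Comparing percentile(simulate_latencies(n, p), 95) for several values of p shows how redundancy trims the tail in this idealised model; the sketch ignores the extra load that clones place on the servers, which is the cost the paper's controller has to balance.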

Place, publisher, year, edition, pages
IEEE, 2018
Series
Proceedings of the International Conference on Autonomic Computing, ISSN 2474-0756
National Category
Computer Systems
Identifiers
urn:nbn:se:umu:diva-155047 (URN), 10.1109/ICAC.2018.00013 (DOI), 000450120900004 (), 978-1-5386-5139-1 (ISBN)
Conference
15th IEEE International Conference on Autonomic Computing (ICAC), SEP 03-07, 2018, Trento, ITALY
Available from: 2019-01-07 Created: 2019-01-07 Last updated: 2019-01-07 Bibliographically approved