How will your workload look like in 6 years?: Analyzing Wikimedia's workload
Umeå University, Faculty of Science and Technology, Department of Computing Science. (Cloud and Grid Computing)
Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.
Umeå University, Faculty of Science and Technology, Department of Computing Science.
Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.
2014 (English). In: Proceedings of the 2014 IEEE International Conference on Cloud Engineering (IC2E 2014) / [ed] Lisa O’Conner, IEEE Computer Society, 2014, 349-354 p. Conference paper (Refereed)
Abstract [en]

An accurate understanding of workloads is key to efficient cloud resource management as well as to the design of large-scale applications. We analyze and model the workload of Wikipedia, one of the world's largest web sites. With descriptive statistics, time-series analysis, and polynomial splines, we study the trend and seasonality of the workload and its evolution over the years, and we investigate patterns in page popularity. Our results indicate that the workload is highly predictable, with strong seasonality. Our short-term prediction algorithm is able to predict the workload with a Mean Absolute Percentage Error of around 2%.
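
The abstract reports prediction accuracy as a Mean Absolute Percentage Error (MAPE). As a minimal sketch of how that metric is usually computed (this is the standard definition, not the authors' code, and the request counts below are hypothetical illustration values rather than data from the paper):

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, returned in percent.

    Standard definition: the mean of |actual - predicted| / |actual|.
    Assumes no actual value is zero.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical request counts per interval (not taken from the paper).
actual = [100.0, 200.0, 400.0, 500.0]
predicted = [98.0, 204.0, 392.0, 510.0]

# Every forecast here is off by 2% of the actual value, so the MAPE is about 2.
error = mape(actual, predicted)
print(f"MAPE: {error:.2f}%")
```

Because MAPE divides by the actual value, it is undefined when an observation is zero, which is unlikely to matter for a workload as large as the one described here.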

Place, publisher, year, edition, pages
IEEE Computer Society, 2014. 349-354 p.
IEEE, ISSN 2373-3845
National Category
Other Electrical Engineering, Electronic Engineering, Information Engineering; Computer Systems
Research subject
Computing Science
URN: urn:nbn:se:umu:diva-87235; DOI: 10.1109/IC2E.2014.50; ISI: 000361018600043; ISBN: 978-1-4799-3766-0; OAI: diva2:707725
IC2E 2014, IEEE International Conference on Cloud Engineering, Boston, Massachusetts, 11-14 March 2014
Swedish Research Council, C0590801; eSSENCE - An eScience Collaboration
Available from: 2014-03-25; Created: 2014-03-25; Last updated: 2015-10-06. Bibliographically approved
In thesis
1. Workload characterization, controller design and performance evaluation for cloud capacity autoscaling
2015 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

This thesis studies cloud capacity auto-scaling, or how to provision and release resources to a service running in the cloud based on its actual demand using an automatic controller. As the performance of server systems depends on the system design, the system implementation, and the workloads the system is subjected to, we focus on these aspects with respect to designing auto-scaling algorithms. Towards this goal, we design and implement two auto-scaling algorithms for cloud infrastructures. The algorithms predict the future load for an application running in the cloud. We discuss the different approaches to designing an auto-scaler that combines reactive and proactive control methods and that can handle long-running requests, e.g., tasks running for longer than the actuation interval, in a cloud. We compare the performance of our algorithms with state-of-the-art auto-scalers and evaluate the controllers’ performance with a set of workloads. As any controller is designed with an assumption on the operating conditions and system dynamics, the performance of an auto-scaler varies with different workloads.

In order to better understand the workload dynamics and evolution, we analyze a six-year-long workload trace of the sixth most popular Internet website. In addition, we analyze a workload from one of the largest Video-on-Demand streaming services in Sweden. We discuss the popularity of objects served by the two services, the spikes in the two workloads, and the invariants in the workloads. We also introduce a measure for the disorder in a workload, i.e., the amount of burstiness. The measure is based on Sample Entropy, an empirical statistic used in biomedical signal processing to characterize biomedical signals. The introduced measure can be used to characterize the workloads based on their burstiness profiles. We compare our introduced measure with the literature on quantifying burstiness in a server workload, and show the advantages of our introduced measure.

To better understand the tradeoffs between using different auto-scalers with different workloads, we design a framework to compare auto-scalers and give probabilistic guarantees on the performance in worst-case scenarios. Using different evaluation criteria and more than 700 workload traces, we compare six state-of-the-art auto-scalers that we believe represent the development of the field in the past 8 years. Knowing that the auto-scalers’ performance depends on the workloads, we design a workload analysis and classification tool that assigns a workload to its most suitable elasticity controller out of a set of implemented controllers. The tool has two main components: an analyzer and a classifier. The analyzer analyzes a workload and feeds the analysis results to the classifier. The classifier assigns a workload to the most suitable elasticity controller based on the workload characteristics and a set of predefined business-level objectives. The tool is evaluated with a set of collected real workloads and a set of generated synthetic workloads. Our evaluation results show that the tool can help a cloud provider to improve the QoS provided to the customers.
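
The burstiness measure mentioned in the abstract is based on Sample Entropy. The sketch below shows plain Sample Entropy only, under common conventions (Chebyshev distance, matches counted when the distance is at most `r`, self-matches excluded); it is not the thesis's derived measure, and the parameter defaults and the two example series are assumptions chosen for illustration.

```python
import math
import random
import statistics


def sample_entropy(series, m=2, r=0.2):
    """Plain Sample Entropy, SampEn(m, r), of a numeric sequence.

    m is the template length and r the absolute matching tolerance.
    Returns math.inf when no matches are found at either length.
    """
    n = len(series)
    if n <= m + 1:
        raise ValueError("series is too short for the chosen m")

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                distance = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    count += 1
        return count

    b = count_matches(m)      # matching pairs of length m
    a = count_matches(m + 1)  # matching pairs of length m + 1
    if a == 0 or b == 0:
        return math.inf
    return -math.log(a / b)


# Hypothetical series: a perfectly periodic load and an irregular one.
regular = [1.0, 2.0] * 20
rng = random.Random(0)
noisy = [rng.random() for _ in range(60)]

# A common heuristic (an assumption here) scales r by the standard deviation.
regular_entropy = sample_entropy(regular, m=2, r=0.2 * statistics.pstdev(regular))
noisy_entropy = sample_entropy(noisy, m=2, r=0.2 * statistics.pstdev(noisy))
print(regular_entropy, noisy_entropy)
```

Lower values indicate a more regular, self-similar series; the periodic example scores zero because every template match of length `m` also extends to length `m + 1`.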

Place, publisher, year, edition, pages
Umeå: Umeå University, 2015. 16 p.
Report / UMINF, ISSN 0348-0542 ; 15.09
cloud computing, autoscaling, workloads, performance modeling, controller design
National Category
Computer Systems
urn:nbn:se:umu:diva-108398 (URN); 978-91-7601-330-4 (ISBN)
Public defence
2015-10-02, N360, Naturveterhuset Building, Umeå University, Umeå, 14:00 (English)
EU, European Research Council; Swedish Research Council
Available from: 2015-09-11; Created: 2015-09-10; Last updated: 2015-10-07. Bibliographically approved

By author/editor
Ali-Eldin, Ahmed; Rezaie, Ali; Mehta, Amardeep; Razroev, Stanislav; Sjöstedt-de Luna, Sara; Seleznjev, Oleg; Tordsson, Johan; Elmroth, Erik