Improving Cloud Service Resilience using Brownout-Aware Load-Balancing
Umeå University, Faculty of Science and Technology, Department of Computing Science. (DS) ORCID iD: 0000-0003-0106-3049
Lund University, Sweden. ORCID iD: 0000-0002-1364-8127
Lund University, Sweden.
Lund University, Sweden.
2014 (English). In: 2014 IEEE 33rd International Symposium on Reliable Distributed Systems (SRDS), IEEE Computer Society, 2014, p. 31-40. Conference paper, Published paper (Refereed)
Abstract [en]

We focus on improving the resilience of cloud services (e.g., an e-commerce website) when correlated or cascading failures lead to a shortage of computing capacity. We study how to extend the classical cloud service architecture, composed of a load-balancer and replicas, with a recently proposed self-adaptive paradigm called brownout. Brownout-compliant services are able to reduce their capacity requirements by degrading user experience (e.g., disabling recommendations).

Combining resilience with the brownout paradigm is to date an open practical problem. The issue is to ensure that replica self-adaptivity does not confuse the load-balancing algorithm, causing it to overload replicas that are already struggling with a capacity shortage. For example, load-balancing strategies based on response times are not able to decide which replicas should be selected, since the response times are already controlled by the brownout paradigm.
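The point about response times can be illustrated with a toy model. The following Python sketch is not the controller from the paper: the cost constants, the linear response-time model, the integral-style update rule and the name `simulate_replica` are all assumptions made for illustration. Each replica adjusts a "dimmer" (the fraction of requests served with optional content) to hold its response time at a setpoint, so a slow and a fast replica end up reporting nearly the same response time.

```python
# Toy brownout model (illustrative assumptions, not the paper's controller).
MANDATORY_COST = 0.05  # assumed work per request without optional content
OPTIONAL_COST = 0.15   # assumed extra work when optional content is served


def simulate_replica(capacity, load=10.0, setpoint=1.0, steps=200, gain=0.05):
    """Return (dimmer, response_time) after the replica has self-adapted."""

    def response_time(dimmer):
        # Assumed linear model: more optional content means slower responses.
        return load * (MANDATORY_COST + dimmer * OPTIONAL_COST) / capacity

    dimmer = 1.0  # start by serving optional content on every request
    for _ in range(steps):
        error = setpoint - response_time(dimmer)
        # Integral-style update, clamped to the valid range [0, 1].
        dimmer = min(1.0, max(0.0, dimmer + gain * error))
    return dimmer, response_time(dimmer)


slow_dimmer, slow_rt = simulate_replica(capacity=1.0)
fast_dimmer, fast_rt = simulate_replica(capacity=1.5)
print(f"slow replica: dimmer={slow_dimmer:.2f}, response time={slow_rt:.2f}")
print(f"fast replica: dimmer={fast_dimmer:.2f}, response time={fast_rt:.2f}")
```

In this model both replicas settle at the same response time while their dimmers differ, which is why a balancer that only observes response times has nothing to distinguish them by.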

In this paper we propose two novel brownout-aware load-balancing algorithms. To test their practical applicability, we extended the popular lighttpd web server and load-balancer, thus obtaining a production-ready implementation. Experimental evaluation shows that the approach enables cloud services to remain responsive despite cascading failures. Moreover, when compared to Shortest Queue First (SQF), believed to be near-optimal in the non-adaptive case, our algorithms improve user experience by 5%, with high statistical significance, while preserving response time predictability.
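For orientation, the Shortest Queue First baseline named above can be sketched in a few lines of Python. This is a minimal illustration of the general SQF idea (send each request to the replica with the fewest outstanding requests), not the lighttpd implementation or the brownout-aware algorithms from the paper; the class and method names are hypothetical.

```python
class ShortestQueueFirst:
    """Minimal SQF sketch: pick the replica with the fewest outstanding requests."""

    def __init__(self, replicas):
        # Number of requests sent to each replica that have not completed yet.
        self.outstanding = {replica: 0 for replica in replicas}

    def pick(self):
        # Ties are broken by insertion order of the replicas.
        replica = min(self.outstanding, key=self.outstanding.get)
        self.outstanding[replica] += 1
        return replica

    def done(self, replica):
        # Call when a response from the replica has been received.
        self.outstanding[replica] -= 1


balancer = ShortestQueueFirst(["replica-a", "replica-b"])
first = balancer.pick()    # both queues empty, first replica wins the tie
second = balancer.pick()   # replica-a now has one outstanding request
balancer.done(first)       # replica-a finishes its request
third = balancer.pick()    # replica-a has the shorter queue again
```

SQF uses only queue lengths, so it has no view of how much optional content each replica is currently able to serve.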

Place, publisher, year, edition, pages
IEEE Computer Society, 2014. 31-40 p.
Series
Symposium on Reliable Distributed Systems Proceedings, ISSN 1060-9857
Keyword [en]
cloud, load-balancing, self-adaptation, control theory, statistical evaluation
National Category
Computer Systems; Control Engineering
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:umu:diva-91327
ISI: 000380439400004
ISBN: 978-1-4799-5584-8 (print)
OAI: oai:DiVA.org:umu-91327
DiVA: diva2:735567
Conference
2014 IEEE 33rd International Symposium on Reliable Distributed Systems (SRDS), October 6-9, 2014, Nara, Japan
Projects
Cloud Control
Funder
Swedish Research Council; ELLIIT - The Linköping-Lund Initiative on IT and Mobile Communications; Linnaeus research environment CADICS
Available from: 2014-07-29 Created: 2014-07-29 Last updated: 2017-01-16. Bibliographically approved

Open Access in DiVA

No full text

Authority records

Klein, Cristian; Papadopoulos, Alessandro Vittorio; Maggio, Martina; Hernández-Rodriguez, Francisco; Elmroth, Erik
