Decentralized scalable fairshare scheduling
Umeå University, Faculty of Science and Technology, Department of Computing Science.
Umeå University, Faculty of Science and Technology, Department of Computing Science.
Umeå University, Faculty of Science and Technology, Department of Computing Science. (UMIT)
2013 (English) In: Future generations computer systems, ISSN 0167-739X, Vol. 29, no 1, 130-143 p. Article in journal (Refereed) Published
Abstract [en]

This work addresses Grid fairshare allocation policy enforcement and presents Aequus, a decentralized system for Grid-wide fairshare job prioritization. The main idea of fairshare scheduling is to prioritize users with regard to predefined resource allocation quotas. The presented system builds on three contributions: a flexible tree-based policy model that allows delegation of policy definition, a job prioritization algorithm based on local enforcement of distributed fairshare policies, and a decentralized architecture for non-intrusive integration with existing scheduling systems. The system supports organization of users in virtual organizations and divides usage policies into local and global policy components that are defined by resource owners and virtual organizations. The architecture realization is presented in detail along with an evaluation of the system behavior in an emulated environment. In the evaluation, convergence noise types (mechanisms counteracting policy allocation convergence) are characterized and quantified, and the system is demonstrated to meet scheduling objectives and perform scalably under realistic operating conditions.
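The idea of prioritizing users against quotas in a tree-shaped policy can be illustrated with a minimal sketch. It is not the Aequus algorithm, whose details are not given in this record; the `PolicyNode` class, the `fairshare_vectors` and `prioritize` functions, and all numbers are hypothetical. The sketch assumes that, at each level of the tree, a node's score is its target share among its siblings minus its share of their combined usage, and that users are ranked by comparing these per-level scores from the root downward.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PolicyNode:
    """Hypothetical policy tree node: a virtual organization, group, or user."""
    name: str
    share: float                       # quota weight relative to sibling nodes
    usage: float = 0.0                 # consumed resources (set on leaves only)
    children: List["PolicyNode"] = field(default_factory=list)

    def total_usage(self) -> float:
        """Usage of a leaf, or the summed usage of the whole subtree."""
        if not self.children:
            return self.usage
        return sum(child.total_usage() for child in self.children)


def fairshare_vectors(root: PolicyNode) -> Dict[str, List[float]]:
    """Map each leaf to its per-level (target share - usage share) scores."""
    result: Dict[str, List[float]] = {}

    def walk(node: PolicyNode, prefix: List[float]) -> None:
        if not node.children:
            result[node.name] = prefix
            return
        share_sum = sum(child.share for child in node.children)
        usage_sum = sum(child.total_usage() for child in node.children)
        for child in node.children:
            target = child.share / share_sum
            actual = child.total_usage() / usage_sum if usage_sum else 0.0
            walk(child, prefix + [target - actual])

    walk(root, [])
    return result


def prioritize(root: PolicyNode) -> List[str]:
    """Rank leaves so that the most under-served user comes first."""
    vectors = fairshare_vectors(root)
    return sorted(vectors, key=lambda name: vectors[name], reverse=True)


# Assumed example policy: two virtual organizations with a 60/40 split.
grid = PolicyNode("grid", 1, children=[
    PolicyNode("vo_a", 60, children=[
        PolicyNode("alice", 1, usage=30),
        PolicyNode("bob", 1, usage=10),
    ]),
    PolicyNode("vo_b", 40, children=[
        PolicyNode("carol", 1, usage=60),
    ]),
])
ranking = prioritize(grid)
```

In this example `vo_a` has used less than its quota, so both of its users rank ahead of `carol`, and `bob` ranks ahead of `alice` because he has consumed less of the share they split equally. Comparing level by level is one way to let a policy defined higher in the tree take precedence over policies delegated further down.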

Place, publisher, year, edition, pages
Elsevier, 2013. Vol. 29, no 1, 130-143 p.
Keyword [en]
Grid scheduling, Fairshare scheduling, Grid allocation policy enforcement
National Category
Computer Science
Identifiers
URN: urn:nbn:se:umu:diva-40492; DOI: 10.1016/j.future.2012.06.001; OAI: oai:DiVA.org:umu-40492; DiVA: diva2:399878
Available from: 2011-02-24 Created: 2011-02-24 Last updated: 2013-09-19 Bibliographically approved
In thesis
1. Virtual infrastructures for computational science: software and architectures for distributed job and resource management
2011 (English) Doctoral thesis, comprehensive summary (Other academic)
Alternative title[sv]
Virtuella infrastrukturer för beräkningsvetenskap : programvaror och arkitekturer för distribuerad jobb- och resurshantering
Abstract [en]

In computational science, the scale of problems addressed and the resolution of solutions achieved are often limited by the available computational capacity. The current methodology of scaling computational capacity to large scale (i.e., larger than individual resource site capacity) includes aggregation and federation of distributed resource systems. Regardless of how this aggregation manifests, scaling of scientific computational problems typically involves (re)formulation of computational structures and problems to exploit problem and resource parallelism. Efficient parallelization and scaling of scientific computations to large scale is difficult and further complicated by a number of factors introduced by resource aggregation, e.g., resource heterogeneity and coupling of computational methodology. Scaling complexity severely impacts computation enactment and necessitates the use of mechanisms that provide higher abstractions for management of computations in distributed computing environments.

This work addresses design and construction of virtual infrastructures for scientific computation that abstract computation enactment complexity, decouple computation specification from computation enactment, and facilitate large-scale use of computational resource systems. In particular, this thesis discusses job and resource management in distributed virtual scientific infrastructures intended for Grid and Cloud computing environments. The main area studied is Grid computing, which is approached using Service-Oriented Computing and Architecture methodology. Thesis contributions discuss both methodology and mechanisms for construction of virtual infrastructures, and address individual problems such as job management, application integration, scheduling job prioritization, and service-based software development.

In addition to scientific publications, this work also makes contributions in the form of software artifacts that demonstrate the concepts discussed. The Grid Job Management Framework (GJMF) abstracts job enactment complexity and provides a range of middleware-agnostic job submission, control, and monitoring interfaces. The FSGrid framework provides a generic model for specification and delegation of resource allocations in virtual organizations, and enacts allocations based on distributed fairshare job prioritization. Mechanisms such as these decouple job and resource management from computational infrastructure systems and facilitate the construction of scalable virtual infrastructures for computational science.

Abstract [sv] (translated to English)

Within computational science, the amount of available computing power often limits both the size of the problems that can be addressed and the quality of the solutions that can be achieved. Methodology for scaling computational capacity to large scale (i.e., beyond the capacity of individual resource centers) is currently based on aggregation and federation of distributed computing resources. Regardless of how this resource aggregation manifests, scaling scientific computations to large scale tends to involve reformulating problems and computational structures to better exploit problem and resource parallelism. Efficient parallelization and scaling of scientific computations is difficult and is further complicated by factors that accompany resource aggregation, e.g., heterogeneity in resource environments and dependencies in programming models and computational methods. This trade-off illustrates the complexity of computation enactment and the need for mechanisms that offer higher levels of abstraction for managing computations in distributed computing environments.

This thesis discusses design and construction of virtual computational infrastructures that abstract the complexity of computation enactment, decouple the design of computations from their execution, and facilitate large-scale use of computing resources for scientific computations. In particular, it treats job and resource management in distributed virtual scientific infrastructures intended for Grid and Cloud computing environments. The main area of the thesis is Grid computing, which is addressed using service-oriented computing and architecture methodology. The work discusses methodology and mechanisms for construction of virtual computational infrastructures and makes contributions in individual areas such as job management, application integration, job prioritization, and service-based software development.

In addition to scientific publications, this work also contributes software systems that illustrate the methods discussed. The Grid Job Management Framework (GJMF) abstracts complexity in the management of computational jobs and offers a set of middleware-agnostic interfaces for running, controlling, and monitoring computational jobs in distributed computing environments. FSGrid offers a generic model for specification and delegation of resource allocation in virtual organizations and is based on distributed fairness-based job prioritization. Mechanisms such as these decouple job and resource management from physical infrastructure systems and facilitate construction of scalable virtual infrastructures for computational science.

Place, publisher, year, edition, pages
Umeå: Institutionen för datavetenskap, Umeå universitet, 2011. 238 p.
Series
Report / UMINF, ISSN 0348-0542 ; 11.02
Identifiers
urn:nbn:se:umu:diva-42428 (URN); 978-91-7459-194-1 (ISBN)
Public defence
2011-05-05, MIT-huset, MA121, Umeå universitet, Umeå, 13:30
Opponent
Supervisors
Available from: 2011-04-11 Created: 2011-04-07 Last updated: 2011-04-29 Bibliographically approved
2. Metadata Management in Multi-Grids and Multi-Clouds
2011 (English) Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Grid computing and cloud computing are two related paradigms used to access and use vast amounts of computational resources. The resources are often owned and managed by a third party, relieving the users from the costs and burdens of acquiring and managing a considerably large infrastructure themselves. Commonly, the resources are either contributed by different stakeholders participating in shared projects (grids), or owned and managed by a single entity and made available to its users with charging based on actual resource consumption (clouds). Individual grid or cloud sites can form collaborations with other sites, giving each site access to more resources that can be used to execute tasks submitted by users. There are several different models of collaborations between sites, each suitable for different scenarios and each posing additional requirements on the underlying technologies.

Metadata concerning the status and resource consumption of tasks is created during the execution of each task on the infrastructure. This metadata is used as the primary input in many core management processes, e.g., as a base for accounting and billing, as input when prioritizing and placing incoming tasks, and as a base for managing the amount of resources allocated to different tasks.
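As a rough illustration of how per-task usage metadata can feed accounting and billing, the sketch below aggregates usage records into per-user totals and prices them. The `UsageRecord` fields, the function names, and the flat price are assumptions made for the example; they do not describe the record formats or systems presented in the thesis.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class UsageRecord:
    """Hypothetical usage record produced when a task finishes."""
    user: str
    site: str
    cpu_hours: float


def usage_per_user(records: Iterable[UsageRecord]) -> Dict[str, float]:
    """Sum consumed CPU hours per user across all sites."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        totals[record.user] += record.cpu_hours
    return dict(totals)


def bill(totals: Dict[str, float], price_per_cpu_hour: float) -> Dict[str, float]:
    """Turn usage totals into charges using an assumed flat price."""
    return {user: hours * price_per_cpu_hour for user, hours in totals.items()}


# Assumed sample data: two users running tasks on two sites.
records = [
    UsageRecord("alice", "site-1", 12.0),
    UsageRecord("bob", "site-1", 3.0),
    UsageRecord("alice", "site-2", 8.0),
]
totals = usage_per_user(records)
invoice = bill(totals, price_per_cpu_hour=0.5)
```

The same per-user totals could also serve as the usage input to a prioritization step, which is why a single stream of usage metadata can support several management processes.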

Focusing on management and utilization of metadata, this thesis contributes to a better understanding of the requirements and challenges imposed by different collaboration models in both grids and clouds. The underlying design criteria and resulting architectures of several software systems are presented in detail. Each system addresses different challenges imposed by cross-site grid and cloud architectures:

  • The LUTSfed approach provides a lean and optional mechanism for filtering and management of usage data between grid or cloud sites.

  • An accounting and billing system natively designed to support cross-site clouds demonstrates usage data management despite unknown placement and dynamic task resource allocation.

  • The FSGrid system enables fairshare job prioritization across different grid sites, mitigating the problems of heterogeneous scheduling software and local management policies.

The results and experiences from these systems are both theoretical and practical, as full-scale implementations of each system have been developed and analyzed as part of this work. Early theoretical work on structure-based service management forms a foundation for future work on structure-aware service placement in cross-site clouds.

Place, publisher, year, edition, pages
Umeå: Umeå University, Department of Computing Science, 2011. 120 p.
Series
Report / UMINF, ISSN 0348-0542 ; 11.08
Keyword
grid computing, cloud computing, accounting, billing, metadata, monitoring, structure, fairshare, scheduling, federated
National Category
Computer Science
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-51159 (URN); 978-91-7459-281-8 (ISBN)
Presentation
2011-10-06, Naturvetarhuset, N320, Umeå universitet, Umeå, 13:00 (English)
Opponent
Supervisors
Available from: 2012-01-25 Created: 2012-01-11 Last updated: 2013-09-12 Bibliographically approved
3. Enabling Technologies for Management of Distributed Computing Infrastructures
2013 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Computing infrastructures offer remote access to computing power that can be employed, e.g., to solve complex mathematical problems or to host computational services that need to be online and accessible at all times. From the perspective of the infrastructure provider, large amounts of distributed and often heterogeneous computer resources need to be united into a coherent platform that is then made accessible to and usable by potential users. Grid computing and cloud computing are two paradigms that can be used to form such unified computational infrastructures.

Resources from several independent infrastructure providers can be joined to form large-scale decentralized infrastructures. The primary advantage of doing this is that it increases the scale of the available resources, making it possible to address more complex problems or to run a greater number of services on the infrastructures. In addition, there are advantages in terms of factors such as fault-tolerance and geographical dispersion. Such multi-domain infrastructures require sophisticated management processes to mitigate the complications of executing computations and services across resources from different administrative domains.

This thesis contributes to the development of management processes for distributed infrastructures that are designed to support multi-domain environments. It describes investigations into how fundamental management processes such as scheduling and accounting are affected by the barriers imposed by multi-domain deployments, which include technical heterogeneity, decentralized and (domain-wise) self-centric decision making, and a lack of information on the state and availability of remote resources.

Four enabling technologies or approaches are explored and developed within this work: (I) The use of explicit definitions of cloud service structure as inputs for placement and management processes to ensure that the resulting placements respect the internal relationships between different service components and any relevant constraints. (II) Technology for the runtime adaptation of Virtual Machines to enable the automatic adaptation of cloud service contexts in response to changes in their environment caused by, e.g., service migration across domains. (III) Systems for managing meta-data relating to resource usage in multi-domain grid computing and cloud computing infrastructures. (IV) A global fairshare prioritization mechanism that enables computational jobs to be consistently prioritized across a federation of several decentralized grid installations.

Each of these technologies will facilitate the emergence of decentralized computational infrastructures capable of utilizing resources from diverse infrastructure providers in an automatic and seamless manner.

Place, publisher, year, edition, pages
Umeå: Umeå Universitet, 2013. 64 p.
Series
Report / UMINF, ISSN 0348-0542 ; 13.19
Keyword
grid computing, cloud computing, accounting, billing, contextualization, monitoring, structure, fairshare, scheduling, federated
National Category
Computer Science
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-80129 (URN); 978-91-7459-704-2 (ISBN)
Public defence
2013-10-17, KBC-huset, Stora hörsalen KBC, KB3B1, Umeå Universitet, Umeå, 13:15 (English)
Opponent
Supervisors
Funder
EU, FP7, Seventh Framework Programme, 215605; EU, FP7, Seventh Framework Programme, 257115; Swedish Research Council, 621-2005-3667; eSSENCE - An eScience Collaboration
Note

Note that the author changed surname from Henriksson to Espling in 2011

Available from: 2013-09-23 Created: 2013-09-10 Last updated: 2013-09-19 Bibliographically approved

Authority records BETA

Östberg, Per-Olov; Espling, Daniel; Elmroth, Erik
