Priority operators for fairshare scheduling
Umeå University, Faculty of Science and Technology, Department of Computing Science.
2015 (English). In: Job scheduling strategies for parallel processing (JSSPP 2014), 2015, p. 70-89. Conference paper, Published paper (Refereed)
Abstract [en]

Collaborative resource sharing in distributed computing requires scalable mechanisms for allocation and control of user quotas. Decentralized fairshare prioritization is a technique for enforcing user quotas that can be realized without centralized control. The technique influences the job scheduling order of local resource management systems using an algorithm that establishes a semantics for prioritizing jobs based on the individual distances between users' quota allocations and their historical resource usage (i.e., the intended and current system state). This work addresses the design and evaluation of priority operators, mathematical functions that quantify fairshare distances, and identifies a set of desirable characteristics for fairshare priority operators. In addition, this work proposes a set of operators for fairshare prioritization, establishes a methodology for verification and evaluation of operator characteristics, and evaluates the proposed operator set based on this mathematical framework. Limitations in the numerical representation of scheduling factor values are identified as a key challenge in priority operator formulation, and it is demonstrated that the contributed priority operators (the Sigmoid operator family) behave robustly even in the presence of severe resolution limitations.
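
To make the notion of a priority operator more concrete, the following is a minimal sketch and not the operators defined in the paper. It assumes that a user's quota share and historical usage share are both normalized to [0, 1], compares a plain difference operator with a hypothetical sigmoid-shaped one built on tanh, and quantizes the result to a small number of integer levels to mimic limited resolution in a scheduling factor. All function names, the `steepness` parameter, the quantization scheme, and the example data are illustrative assumptions.

```python
import math


def absolute_operator(target: float, usage: float) -> float:
    """Illustrative operator: signed distance between quota and usage shares.

    Both inputs are assumed to be normalized shares in [0, 1], so the result
    lies in [-1, 1]; a positive value means the user is under-served.
    """
    return target - usage


def sigmoid_operator(target: float, usage: float, steepness: float = 4.0) -> float:
    """Hypothetical sigmoid-shaped operator (an assumption, not the paper's formula).

    Passes the signed distance through tanh, giving a value in (-1, 1).
    """
    return math.tanh(steepness * (target - usage))


def quantize(value: float, levels: int) -> int:
    """Map a value in [-1, 1] onto the integer levels 0..levels-1."""
    clamped = max(-1.0, min(1.0, value))
    return round((clamped + 1.0) / 2.0 * (levels - 1))


def rank_users(shares, operator, levels=None):
    """Return user names ordered from highest to lowest priority."""
    def key(name):
        target, usage = shares[name]
        value = operator(target, usage)
        return quantize(value, levels) if levels is not None else value
    return sorted(shares, key=key, reverse=True)


# name -> (quota share, historical usage share); made-up example data
EXAMPLE_SHARES = {
    "alice": (0.50, 0.20),
    "bob": (0.30, 0.30),
    "carol": (0.20, 0.50),
}
```

With only three quantization levels, the plain difference maps all three example users to the same level, while the tanh-shaped variant still separates them. This only illustrates, under the assumptions above, why the shape of an operator can matter when resolution is limited.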

Place, publisher, year, edition, pages
2015. 70-89 p.
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 8828
National Category
Computer Science
Identifiers
URN: urn:nbn:se:umu:diva-106517
DOI: 10.1007/978-3-319-15789-4_5
ISI: 000355729800005
ISBN: 978-3-319-15788-7 (print)
ISBN: 978-3-319-15789-4 (electronic)
OAI: oai:DiVA.org:umu-106517
DiVA: diva2:841913
Conference
18th International Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP), May 23, 2014, Phoenix, AZ
Available from: 2015-07-15 Created: 2015-07-14 Last updated: 2017-03-28 Bibliographically approved
In thesis
1. HPC scheduling in a brave new world
2017 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Many breakthroughs in scientific and industrial research are supported by simulations and calculations performed on high performance computing (HPC) systems. These systems typically consist of uniform, largely parallel compute resources and high bandwidth concurrent file systems interconnected by low latency synchronous networks. HPC systems are managed by batch schedulers that order the execution of application jobs to maximize utilization while steering turnaround time. In the past, demands for greater capacity were met by building more powerful systems with more compute nodes, greater transistor densities, and higher processor operating frequencies. Unfortunately, the scope for further increases in processor frequency is restricted by the limitations of semiconductor technology. Instead, parallelism within processors and in numbers of compute nodes is increasing, while the capacity of single processing units remains unchanged. In addition, HPC systems' memory and I/O hierarchies are becoming deeper and more complex to keep up with the systems' processing power. HPC applications are also changing: the need to analyze large data sets and simulation results is increasing the importance of data processing and data-intensive applications. Moreover, composition of applications through workflows within HPC centers is becoming increasingly important. This thesis addresses the HPC scheduling challenges created by such new systems and applications. It begins with a detailed analysis of the evolution of the workloads of three reference HPC systems at the National Energy Research Scientific Computing Center (NERSC), with a focus on job heterogeneity and scheduler performance. This is followed by an analysis and improvement of a fairshare prioritization mechanism for HPC schedulers. The thesis then surveys the current state of the art and expected near-future developments in HPC hardware and applications, and identifies unaddressed scheduling challenges that they will introduce. These challenges include application diversity and issues with workflow scheduling or the scheduling of I/O resources to support applications. Next, a cloud-inspired HPC scheduling model is presented that can accommodate application diversity, takes advantage of malleable applications, and enables short wait times for applications. Finally, to support ongoing scheduling research, an open source scheduling simulation framework is proposed that allows new scheduling algorithms to be implemented and evaluated in a production scheduler using workloads modeled on those of a real system. The thesis concludes with the presentation of a workflow scheduling algorithm to minimize workflows' turnaround time without over-allocating resources.

Place, publisher, year, edition, pages
Umeå: Umeå universitet, 2017. 122 p.
Series
Report / UMINF, ISSN 0348-0542 ; 17.05
Keyword
High Performance Computing, HPC, supercomputing, scheduling, workflows, workloads, exascale
National Category
Computer Science
Research subject
Computing Science
Identifiers
URN: urn:nbn:se:umu:diva-132983
ISBN: 978-91-7601-693-0
Public defence
2017-04-21, MA121, MIT-Huset, Umeå Universitet, Umeå, 10:15 (English)
Funder
eSSENCE - An eScience Collaboration
Swedish Research Council, C0590801
EU, Horizon 2020, 610711
EU, FP7, Seventh Framework Programme, 732667
Note

Work also supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research (ASCR), and used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy, both under Contract No. DE-AC02-05CH11231.

Available from: 2017-03-29 Created: 2017-03-27 Last updated: 2017-05-29 Bibliographically approved

Authors
Rodrigo, Gonzalo P.; Östberg, Per-Olov; Elmroth, Erik
