Towards understanding HPC users and systems: a NERSC case study
Manuscript (preprint), in English (other academic)
The high performance computing (HPC) scheduling landscape is changing. Previously dominated by tightly coupled MPI jobs, HPC workloads increasingly include high-throughput, data-intensive, and stream-processing applications. As a consequence, workloads are becoming more diverse at both the application and job level, posing new challenges to classical HPC schedulers. There is a need to understand current HPC workloads and their future evolution in order to perform informed scheduling research and enable efficient scheduling in future HPC systems. In this paper, we present a methodology to characterize workloads and assess their heterogeneity, both for a particular time period and as they evolve over time. We apply this methodology to the workloads of three systems (Hopper, Edison, and Carver) at the National Energy Research Scientific Computing Center (NERSC). We present the resulting characterization of jobs, queues, heterogeneity, and performance, which includes detailed information for a full year of workload (2014) and its evolution over the systems’ lifetimes. Among the results, we highlight the observation of discontinuities in job wait times for priority groups with high job diversity. Finally, we conclude by summarizing our analysis to establish a reference and inform future scheduling research.
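The abstract describes a methodology for characterizing workload heterogeneity, and the keywords mention k-means. As a minimal, hedged sketch of that kind of analysis (the job features, values, and deterministic initialization below are invented for illustration and are not the paper's actual method), jobs can be clustered by attributes such as wall time and core count:

```python
# Sketch: grouping synthetic job records with a minimal Lloyd's k-means.
# All job records below are invented for illustration only.

def kmeans(points, k, iters=10):
    """Minimal Lloyd's k-means with deterministic initialization."""
    # Seed centers with points spread across the (ordered) input.
    centers = [points[i * (len(points) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            clusters[nearest].append(p)
        # Recompute each center as the mean of its cluster.
        centers = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return centers, clusters

# Invented job records: (log10 wall time in seconds, log10 core count).
jobs = [
    (1.0, 0.0), (1.2, 0.1), (0.9, 0.3),  # short, near-serial jobs
    (4.0, 3.0), (4.2, 3.1), (3.9, 2.8),  # long, highly parallel jobs
]
centers, clusters = kmeans(jobs, k=2)
print(sorted(len(c) for c in clusters))  # -> [3, 3]
```

In practice such features are typically log-scaled and normalized before clustering, since wall times and core counts span several orders of magnitude.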
Keywords: workload analysis, supercomputer, HPC, scheduling, NERSC, heterogeneity, k-means
Research subject Computing Science
Identifiers: URN: urn:nbn:se:umu:diva-132980; OAI: oai:DiVA.org:umu-132980; DiVA: diva2:1084838
Funder: eSSENCE - An eScience Collaboration; EU, Horizon 2020, 610711; EU, Horizon 2020, 732667; Swedish Research Council, C0590801