Publications (3 of 3)
Chen, Y., de Oliveira Castro, P., Bientinesi, P., Jansson, N. & Iakymchuk, R. (2026). Enabling mixed-precision in spectral element codes. Future Generation Computer Systems, 174, Article ID 107990.
2026 (English). In: Future Generation Computer Systems, ISSN 0167-739X, E-ISSN 1872-7115, Vol. 174, article id 107990. Article in journal (Refereed). Published
Abstract [en]

Mixed-precision computing has the potential to significantly reduce the cost of exascale computations, but determining when and how to implement it in programs can be challenging. In this article, we propose a methodology for enabling mixed-precision with the help of computer arithmetic tools, roofline model, and computer arithmetic techniques. As case studies, we consider Nekbone (Nek5000 developers), a mini-application for the Computational Fluid Dynamics (CFD) solver Nek5000 (Fischer et al.), and a modern Neko (Jansson et al., 2024) CFD application. With the help of the Verificarlo (Denis et al., 2016) tool and computer arithmetic techniques, we introduce a strategy to address stagnation issues in the preconditioned Conjugate Gradient method in Nekbone and apply these insights to implement a mixed-precision version of Neko. We evaluate the derived mixed-precision versions of these codes by combining metrics in three dimensions: accuracy, time-to-solution, and energy-to-solution. Notably, mixed-precision in Nekbone reduces time-to-solution by roughly 1.62x and energy-to-solution by 2.43x on MareNostrum 5, while in the real-world Neko application, the gain is up to 1.3x in both time and energy, with the accuracy that matches double-precision results.

Place, publisher, year, edition, pages
Elsevier, 2026
Keywords
Computer arithmetic tool, Conjugate gradient, Energy-to-solution, Mixed-precision, Neko, Roofline model, Verificarlo
National Category
Computer Sciences
Identifiers
urn:nbn:se:umu:diva-242183 (URN); 10.1016/j.future.2025.107990 (DOI); 2-s2.0-105009726439 (Scopus ID)
Available from: 2025-07-14. Created: 2025-07-14. Last updated: 2025-07-14. Bibliographically approved
Chen, Y., Castro, P. d., Bientinesi, P. & Iakymchuk, R. (2025). Enabling mixed-precision with the help of tools: a nekbone case study. In: Roman Wyrzykowski, Jack Dongarra, Ewa Deelman, Konrad Karczewski (Eds.), Parallel processing and applied mathematics: 15th International Conference, PPAM 2024, Ostrava, Czech Republic, September 8–11, 2024, Revised Selected Papers, Part I. Paper presented at 15th International Conference on Parallel Processing and Applied Mathematics, PPAM 2024, Ostrava, Czech Republic, September 8–11, 2024 (pp. 34-50). Cham: Springer Nature
2025 (English). In: Parallel processing and applied mathematics: 15th International Conference, PPAM 2024, Ostrava, Czech Republic, September 8–11, 2024, Revised Selected Papers, Part I / [ed] Roman Wyrzykowski, Jack Dongarra, Ewa Deelman, Konrad Karczewski, Cham: Springer Nature, 2025, p. 34-50. Conference paper, Published paper (Refereed)
Abstract [en]

Mixed-precision computing has the potential to significantly reduce the cost of exascale computations, but determining when and how to implement it in programs can be challenging. In this article, we consider Nekbone, a mini-application for the Computational Fluid Dynamics (CFD) solver Nek5000, as a case study, and propose a methodology for enabling mixed-precision with the help of computer arithmetic tools and the roofline model. We evaluate the derived mixed-precision program by combining metrics in three dimensions: accuracy, time-to-solution, and energy-to-solution. Notably, introducing mixed-precision in Nekbone reduces time-to-solution by 40.7% and energy-to-solution by 47% on 128 MPI ranks without sacrificing accuracy.
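
The abstract above reports gains as percentage reductions (40.7% and 47%), while the journal article listed first reports them as speedup factors (1.62x and 2.43x). The following is a minimal sketch of the arithmetic linking the two forms, assuming a "reduction by r%" means the new cost is (1 - r/100) of the baseline. The function name `reduction_to_speedup` is hypothetical and not taken from either publication; the two papers also report different setups, so the converted numbers are only indicative.

```python
def reduction_to_speedup(reduction_percent: float) -> float:
    """Convert a percentage reduction in cost (time or energy) to a speedup factor.

    Assumption: a reduction of r% means new_cost = baseline * (1 - r/100),
    so the speedup factor is baseline / new_cost = 1 / (1 - r/100).
    """
    if not 0.0 <= reduction_percent < 100.0:
        raise ValueError("reduction_percent must be in [0, 100)")
    return 1.0 / (1.0 - reduction_percent / 100.0)


if __name__ == "__main__":
    # Figures quoted in the conference abstract above.
    for label, r in [("time-to-solution", 40.7), ("energy-to-solution", 47.0)]:
        print(f"{label}: {r}% reduction is about {reduction_to_speedup(r):.2f}x")
```

Under this assumption, a 40.7% reduction corresponds to a factor of roughly 1.69x and a 47% reduction to roughly 1.89x.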

Place, publisher, year, edition, pages
Cham: Springer Nature, 2025
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 15579
Keywords
computer arithmetic tool, Conjugate Gradient, energy-to-solution, Mixed-precision, Nekbone, roofline model, Verificarlo
National Category
Computer Sciences; Computational Mathematics
Identifiers
urn:nbn:se:umu:diva-238100 (URN); 10.1007/978-3-031-85697-6_3 (DOI); 2-s2.0-105002711656 (Scopus ID); 9783031856969 (ISBN)
Conference
15th International Conference on Parallel Processing and Applied Mathematics, PPAM 2024, Ostrava, Czech Republic, September 8–11, 2024
Available from: 2025-05-05. Created: 2025-05-05. Last updated: 2025-05-05. Bibliographically approved
Gedik, G., Kulkarni, K., Chen, Y., Kempf, D., Kemmler, S., Papageorgiou, D., . . . Iakymchuk, R. (2024). Best practice guide – harvesting energy consumption on European HPC systems: sharing experience from the CEEC project. The CEEC Consortium Partners
2024 (English). Report (Other academic)
Abstract [en]

In this document, the EuroHPC JU Center of Excellence in Exascale CFD (CEEC) aims to provide users and application developers with a brief overview of possibilities, limitations, and best practices for measuring energy consumption on European HPC systems. CEEC is working to reduce the energy footprint of its consortium codes on such systems by applying novel algorithmic solutions. However, in initially exploring options for collecting energy measurements on both local and European HPC systems, we found no single approach for energy measurements, and found the process of taking these measurements comparatively more difficult than measuring time-to-solution with e.g. basic start-end time calls. This difficulty often stems from a requirement for privileged access to specific hardware counters. Mitigation strategies for this restriction exist and enable users to collect the energy metric, but they are not widely known. We describe these strategies, followed by concrete examples from CEEC on how to harvest the energy measurements. We believe this will help to increase awareness and thus utilization of energy consumption measurements in the application development process.

Furthermore, we describe several other important issues: 1) granularity and overhead of measurements, since energy = power × time, and 2) what is included (there are multiple factors) in the number delivered by a tool/framework/workload manager. We strive to be concise and precise, aiming to provide a glimpse of energy measurement methods as well as many references for further exploration. Our takeaway messages are:

  • The community/data centers need to facilitate energy measurements on the European HPC systems and teach the community how to conduct such measurements.
  • The community/data centers need to provide transparent and easy-to-use guides on each (at least large) European HPC system, outlining the ways to collect energy measurements.

In CEEC, we are taking the first steps towards spreading these messages, aiming to create a larger consortium including experts and data centers, who can contribute to and update this document. Explore and stay tuned!
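
The relation energy = power × time and the remark on measurement granularity in the abstract above can be illustrated with a minimal sketch. It assumes power is available as timestamped samples in watts and approximates energy with the trapezoidal rule. The function name `energy_joules` and the sample values are hypothetical and do not come from the guide or from any specific measurement tool.

```python
from typing import Sequence


def energy_joules(times_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Approximate energy (J) from power samples (W) taken at times (s).

    Uses the trapezoidal rule: each interval contributes the mean of its
    two endpoint power readings multiplied by the interval length.
    """
    if len(times_s) != len(power_w):
        raise ValueError("times_s and power_w must have the same length")
    if len(times_s) < 2:
        raise ValueError("at least two samples are needed")
    energy = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        energy += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return energy


if __name__ == "__main__":
    # Hypothetical readings: a short 300 W spike around t = 5 s.
    fine_t = [0.0, 4.0, 5.0, 6.0, 10.0]
    fine_p = [100.0, 100.0, 300.0, 100.0, 100.0]
    # The same run sampled only at its start and end misses the spike.
    coarse_t = [0.0, 10.0]
    coarse_p = [100.0, 100.0]
    print("fine sampling:  ", energy_joules(fine_t, fine_p), "J")
    print("coarse sampling:", energy_joules(coarse_t, coarse_p), "J")
```

In this made-up example the coarse sampling reports less energy than the fine sampling because the spike falls between its two readings, which is one way sampling granularity can affect the reported number.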

Place, publisher, year, edition, pages
The CEEC Consortium Partners, 2024. p. 22
National Category
Computational Mathematics; Software Engineering
Research subject
computer science; mathematics
Identifiers
urn:nbn:se:umu:diva-228733 (URN); 10.5281/zenodo.13306639 (DOI)
Funder
EU, Horizon Europe, 101093393
Available from: 2024-08-21. Created: 2024-08-21. Last updated: 2024-08-22. Bibliographically approved
Organisations
Identifiers
ORCID iD: orcid.org/0009-0003-5512-254X