Publications (5 of 5)
Chen, Y., Castro, P. d., Bientinesi, P. & Iakymchuk, R. (2025). Enabling mixed-precision with the help of tools: a nekbone case study. In: Roman Wyrzykowski, Jack Dongarra, Ewa Deelman, Konrad Karczewski (Eds.), Parallel processing and applied mathematics: 15th International Conference, PPAM 2024, Ostrava, Czech Republic, September 8–11, 2024, Revised Selected Papers, Part I. Paper presented at 15th International Conference on Parallel Processing and Applied Mathematics, PPAM 2024, Ostrava, Czech Republic, September 8–11, 2024 (pp. 34-50). Cham: Springer Nature
2025 (English). In: Parallel processing and applied mathematics: 15th International Conference, PPAM 2024, Ostrava, Czech Republic, September 8–11, 2024, Revised Selected Papers, Part I / [ed] Roman Wyrzykowski, Jack Dongarra, Ewa Deelman, Konrad Karczewski, Cham: Springer Nature, 2025, p. 34-50. Conference paper, Published paper (Refereed)
Abstract [en]

Mixed-precision computing has the potential to significantly reduce the cost of exascale computations, but determining when and how to implement it in programs can be challenging. In this article, we consider Nekbone, a mini-application for the Computational Fluid Dynamics (CFD) solver Nek5000, as a case study, and propose a methodology for enabling mixed-precision with the help of computer arithmetic tools and the roofline model. We evaluate the derived mixed-precision program by combining metrics in three dimensions: accuracy, time-to-solution, and energy-to-solution. Notably, the introduction of mixed-precision in Nekbone reduces time-to-solution by 40.7% and energy-to-solution by 47% on 128 MPI ranks without sacrificing accuracy.
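
As a rough illustration of why a roofline model can guide precision choices, the sketch below evaluates the basic roofline bound, min(peak, bandwidth × arithmetic intensity), for a dot product stored in double and in single precision. The machine figures (1000 GFLOP/s peak, 100 GB/s memory bandwidth) and the function name `roofline_gflops` are hypothetical and are not taken from the paper.

```python
def roofline_gflops(peak_gflops, bandwidth_gbs, flops, bytes_moved):
    """Attainable performance (GFLOP/s) under the basic roofline model."""
    intensity = flops / bytes_moved  # arithmetic intensity in flop per byte
    return min(peak_gflops, bandwidth_gbs * intensity)


n = 1_000_000
flops = 2 * n  # one multiply and one add per pair of elements

# Two input vectors of n elements each: 8 bytes per value in double
# precision, 4 bytes per value in single precision.
fp64_bound = roofline_gflops(1000.0, 100.0, flops, 2 * n * 8)
fp32_bound = roofline_gflops(1000.0, 100.0, flops, 2 * n * 4)

print(fp64_bound, fp32_bound)
```

On this assumed machine the kernel is memory-bound in both precisions, so halving the bytes per value doubles the bound. Whether the lower precision is numerically acceptable is a separate question that the bound does not answer.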

Place, publisher, year, edition, pages
Cham: Springer Nature, 2025
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 15579
Keywords
computer arithmetic tool, Conjugate Gradient, energy-to-solution, Mixed-precision, Nekbone, roofline model, Verificarlo
National Category
Computer Sciences Computational Mathematics
Identifiers
urn:nbn:se:umu:diva-238100 (URN); 10.1007/978-3-031-85697-6_3 (DOI); 2-s2.0-105002711656 (Scopus ID); 9783031856969 (ISBN)
Conference
15th International Conference on Parallel Processing and Applied Mathematics, PPAM 2024, Ostrava, Czech Republic, September 8–11, 2024
Available from: 2025-05-05; Created: 2025-05-05; Last updated: 2025-05-05; Bibliographically approved
Gedik, G., Kulkarni, K., Chen, Y., Kempf, D., Kemmler, S., Papageorgiou, D., . . . Iakymchuk, R. (2024). Best practice guide – harvesting energy consumption on European HPC systems: sharing experience from the CEEC project. The CEEC Consortium Partners
2024 (English). Report (Other academic)
Abstract [en]

In this document, the EuroHPC JU Center of Excellence in Exascale CFD (CEEC) aims to provide users/application developers with a brief overview of possibilities, limitations, and best practices for measuring energy consumption on European HPC systems. CEEC is working to reduce the energy footprint of its consortium codes on such systems by applying novel algorithmic solutions. However, in initially exploring options for collecting energy measurements on both local and European HPC systems, we found no single approach for energy measurements, and we found the process of taking these measurements comparatively more difficult than measuring time-to-solution with, e.g., basic start-end time calls. This difficulty often stems from a requirement for privileged access to specific hardware counters. Mitigation strategies for this restriction exist and enable users to collect the energy metric, but they are not widely known. We describe these strategies, followed by concrete examples from CEEC on how to harvest the energy measurements. We believe this will help to increase awareness and thus utilization of energy consumption measurements in the application development process.

Furthermore, we describe several other important issues: 1) the granularity and overhead of measurements, since energy = power × time, and 2) what is included (there are multiple factors) in the number delivered by a tool/framework/workload manager. We strive to be concise and precise, aiming to provide a glimpse of energy measurement methods as well as many references for further exploration. Our takeaway messages are:

  • The community/data centers need to facilitate energy measurements on the European HPC systems and teach the community how to conduct such measurements.
  • The community/data centers need to provide transparent and easy-to-use guides on each (at least large) European HPC system, outlining the ways to collect energy measurements.

In CEEC, we are taking the first steps towards spreading these messages, aiming to create a larger consortium including experts and data centers, who can contribute to and update this document. Explore and stay tuned!
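
To make the granularity point concrete, here is a minimal sketch that estimates energy by integrating sampled power over time with the trapezoidal rule. The sample values and the helper name `energy_joules` are invented for illustration and do not come from the guide or from any measurement tool.

```python
def energy_joules(timestamps_s, power_w):
    """Integrate power samples (W) over time (s) with the trapezoidal rule."""
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have equal length")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


# Hypothetical run: power ramps up, stays at 200 W, then ramps down.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
p = [100.0, 200.0, 200.0, 200.0, 100.0]

fine = energy_joules(t, p)                # one sample per second
coarse = energy_joules(t[::4], p[::4])    # only the first and last sample

print(fine, coarse)
```

With these made-up samples, the one-second sampling gives 700 J, while keeping only the endpoints gives 400 J, because the coarse sampling never sees the 200 W plateau.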

Place, publisher, year, edition, pages
The CEEC Consortium Partners, 2024. p. 22
National Category
Computational Mathematics Software Engineering
Research subject
Computer Science; Mathematics
Identifiers
urn:nbn:se:umu:diva-228733 (URN); 10.5281/zenodo.13306639 (DOI)
Funder
EU, Horizon Europe, 101093393
Available from: 2024-08-21; Created: 2024-08-21; Last updated: 2024-08-22; Bibliographically approved
Iakymchuk, R., Graillat, S. & Aliaga, J. I. (2024). General framework for re-assuring numerical reliability in parallel Krylov solvers: a case of bi-conjugate gradient stabilized methods. The international journal of high performance computing applications, 38(1), 17-33
2024 (English). In: The international journal of high performance computing applications, ISSN 1094-3420, E-ISSN 1741-2846, Vol. 38, no 1, p. 17-33. Article in journal (Refereed), Published
Abstract [en]

Parallel implementations of Krylov subspace methods often help to accelerate the procedure of finding an approximate solution of a linear system. However, such parallelization, coupled with asynchronous and out-of-order execution, often makes the impact of non-associativity in floating-point operations more visible. These problems are amplified further when communication-hiding pipelined algorithms are used to improve the parallelization of Krylov subspace methods. Introducing reproducibility in the implementations avoids these problems by yielding more robust and correct solutions. This paper proposes a general framework for deriving reproducible and accurate variants of Krylov subspace methods. The proposed algorithmic strategies are reinforced by programmability suggestions to assure deterministic and accurate executions. The framework is illustrated on the preconditioned BiCGStab method and its pipelined modification, which is in fact a distinct method from the Krylov subspace family, for the solution of non-symmetric linear systems with message-passing. Finally, we verify the numerical behavior of the two reproducible variants of BiCGStab on a set of matrices from the SuiteSparse Matrix Collection and a 3D Poisson’s equation.
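
The non-associativity mentioned above can be seen in a few lines. The sketch below sums the same three double-precision values in two groupings and then with Python's `math.fsum`, which returns a correctly rounded sum regardless of input order. `math.fsum` is used here only as a convenient stand-in for an order-independent summation; it is not the ExBLAS approach described in the paper.

```python
import itertools
import math

a, b, c = 1e16, -1e16, 1.0

# The grouping changes the result because 1.0 is lost when it is added
# to -1e16 (doubles near 1e16 are spaced 2 apart).
left = (a + b) + c    # cancellation first, then add 1.0
right = a + (b + c)   # 1.0 is absorbed by rounding before the cancellation

# A correctly rounded sum gives the same answer for every ordering.
exact_sums = {math.fsum(perm) for perm in itertools.permutations([a, b, c])}

print(left, right, exact_sums)
```

In a parallel reduction the grouping depends on how partial sums are combined across processes, which is why the same solver can return slightly different results from run to run.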

Place, publisher, year, edition, pages
Sage Publications, 2024
Keywords
accuracy, ExBLAS, HPC, Numerical reliability, PBiCGStab, pipelined PBiCGStab, reproducibility
National Category
Computational Mathematics
Identifiers
urn:nbn:se:umu:diva-216137 (URN); 10.1177/10943420231207642 (DOI); 001087250200001 (); 2-s2.0-85174938956 (Scopus ID)
Available from: 2023-11-02; Created: 2023-11-02; Last updated: 2025-04-24; Bibliographically approved
Havdiak, M., Aliaga, J. I. & Iakymchuk, R. (2024). Robustness and accuracy in pipelined Bi-Conjugate Gradient Stabilized methods. In: Leonardo Franco; Clélia de Mulatier; Maciej Paszynski; Valeria V. Krzhizhanovskaya; Jack J. Dongarra; Peter M. A. Sloot (Eds.), Computational science – ICCS 2024: 24th International Conference, Malaga, Spain, July 2–4, 2024, Proceedings, Part III. Paper presented at 24th International Conference on Computational Science, ICCS 2024, Malaga, Spain, July 2–4, 2024 (pp. 311-319). Springer
2024 (English). In: Computational science – ICCS 2024: 24th International Conference, Malaga, Spain, July 2–4, 2024, Proceedings, Part III / [ed] Leonardo Franco; Clélia de Mulatier; Maciej Paszynski; Valeria V. Krzhizhanovskaya; Jack J. Dongarra; Peter M. A. Sloot, Springer, 2024, p. 311-319. Conference paper, Published paper (Refereed)
Abstract [en]

In this article, we propose an accuracy-assuring technique for finding a solution for unsymmetric linear systems. Such problems arise in different areas such as image processing, computer vision, and computational fluid dynamics. Parallel implementation of Krylov subspace methods speeds up finding approximate solutions for linear systems. In this context, the refined approach in pipelined BiCGStab enhances scalability on distributed memory machines, yielding substantial speed improvements compared to the standard BiCGStab method. However, it is worth noting that the pipelined BiCGStab algorithm sacrifices some accuracy, which is stabilized with the residual replacement technique. This paper aims to address this issue by employing the ExBLAS-based reproducible approach. We validate the idea on a set of matrices from the SuiteSparse Matrix Collection.

Place, publisher, year, edition, pages
Springer, 2024
Series
Lecture notes in computer science, ISSN 0302-9743, E-ISSN 1611-3349 ; 14834
Keywords
BiCGStab, ExBLAS, HPC, Krylov subspace methods, Numerical reliability, Residual replacement
National Category
Computational Mathematics
Identifiers
urn:nbn:se:umu:diva-228515 (URN); 10.1007/978-3-031-63759-9_35 (DOI); 001279325500035 (); 2-s2.0-85199660458 (Scopus ID); 9783031637582 (ISBN); 9783031637599 (ISBN)
Conference
24th International Conference on Computational Science, ICCS 2024, Malaga, Spain, July 2–4, 2024
Available from: 2024-08-20; Created: 2024-08-20; Last updated: 2025-04-24; Bibliographically approved
Iakymchuk, R., Graillat, S. & Aliaga, J. I. (2023). General framework for deriving reproducible Krylov subspace algorithms: BiCGStab case. In: Roman Wyrzykowski; Jack Dongarra; Ewa Deelman; Konrad Karczewski (Eds.), Parallel processing and applied mathematics: 14th International Conference, PPAM 2022, Gdansk, Poland, September 11–14, 2022, Revised Selected Papers, Part I. Paper presented at 14th International Conference on Parallel Processing and Applied Mathematics, PPAM 2022, September 11-14, 2022. (pp. 16-29). Springer Science+Business Media B.V.
2023 (English). In: Parallel processing and applied mathematics: 14th International Conference, PPAM 2022, Gdansk, Poland, September 11–14, 2022, Revised Selected Papers, Part I / [ed] Roman Wyrzykowski; Jack Dongarra; Ewa Deelman; Konrad Karczewski, Springer Science+Business Media B.V., 2023, p. 16-29. Conference paper, Published paper (Refereed)
Abstract [en]

Parallel implementations of Krylov subspace algorithms often help to accelerate the procedure of finding the solution of a linear system. However, such parallelization, coupled with asynchronous and out-of-order execution, often amplifies the effects of the non-associativity of floating-point operations. This results in non-reproducibility on the same or different settings. This paper proposes a general framework for deriving reproducible and accurate variants of a Krylov subspace algorithm. The proposed algorithmic strategies are reinforced by programmability suggestions to assure deterministic and accurate executions. The framework is illustrated on the preconditioned BiCGStab method for the solution of non-symmetric linear systems with message-passing. Finally, we verify the two reproducible variants of PBiCGStab on a set of matrices from the SuiteSparse Matrix Collection and a 3D Poisson’s equation.

Place, publisher, year, edition, pages
Springer Science+Business Media B.V., 2023
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 13826
Keywords
accuracy, floating-point expansion, fused multiply-add, long accumulator, preconditioned BiCGStab, Reproducibility
National Category
Computational Mathematics Computer Sciences
Identifiers
urn:nbn:se:umu:diva-210209 (URN); 10.1007/978-3-031-30442-2_2 (DOI); 2-s2.0-85161362443 (Scopus ID); 9783031304415 (ISBN); 978-3-031-30442-2 (ISBN)
Conference
14th International Conference on Parallel Processing and Applied Mathematics, PPAM 2022, September 11-14, 2022.
Available from: 2023-06-28; Created: 2023-06-28; Last updated: 2023-06-28; Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0003-2414-700X