Fine-Grained Bulge-Chasing Kernels for Strongly Scalable Parallel QR Algorithms
Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N). ORCID iD: 0000-0002-4675-7434
Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
Umeå University, Faculty of Science and Technology, Department of Computing Science. Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N).
2014 (English). In: Parallel Computing, ISSN 0167-8191, E-ISSN 1872-7336, no. 7, pp. 271-288. Journal article (peer-reviewed). Published
Abstract [en]

The bulge-chasing kernel in the small-bulge multi-shift QR algorithm for the non-symmetric dense eigenvalue problem becomes a sequential bottleneck when the QR algorithm is run in parallel on a multicore platform with shared memory. The duration of each kernel invocation is short, but the critical path of the QR algorithm contains a long sequence of calls to the bulge-chasing kernel. We study the problem of parallelizing the bulge-chasing kernel itself across a handful of processor cores in order to reduce the execution time of the critical path. We propose and evaluate a sequence of four algorithms with varying degrees of complexity and verify that a pipelined algorithm with a slowly shifting block column distribution of the Hessenberg matrix is superior. The load-balancing problem is non-trivial and computational experiments show that the load-balancing scheme has a large impact on the overall performance. We propose two heuristics for the load-balancing problem and also an effective optimization method based on local search. Numerical experiments show that speed-ups are obtained for problems as small as 40-by-40 on two different multicore architectures.
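The abstract does not spell out the two heuristics or the local-search method, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a hypothetical cost model in which each matrix column has a known work estimate (`costs`), each core owns one contiguous block of columns, and the goal is to minimize the largest per-core load. The function names `loads`, `even_split`, and `local_search` are invented for this example.

```python
from typing import List, Sequence, Tuple


def loads(costs: Sequence[float], bounds: Sequence[int]) -> List[float]:
    """Per-core load when core i owns columns bounds[i] .. bounds[i+1]-1."""
    return [sum(costs[bounds[i]:bounds[i + 1]]) for i in range(len(bounds) - 1)]


def even_split(n: int, p: int) -> List[int]:
    """Baseline heuristic (assumed): roughly n/p columns per core."""
    return [(i * n) // p for i in range(p + 1)]


def local_search(costs: Sequence[float],
                 bounds: Sequence[int]) -> Tuple[List[int], float]:
    """Shift one interior boundary by one column while the max load improves."""
    bounds = list(bounds)
    best = max(loads(costs, bounds))
    improved = True
    while improved:
        improved = False
        for i in range(1, len(bounds) - 1):
            for step in (-1, 1):
                trial = bounds[:]
                trial[i] += step
                # Keep every block non-empty.
                if not (trial[i - 1] < trial[i] < trial[i + 1]):
                    continue
                cost = max(loads(costs, trial))
                if cost < best:
                    bounds, best, improved = trial, cost, True
    return bounds, best


if __name__ == "__main__":
    # Hypothetical per-column work estimates (not measured data).
    costs = [8, 7, 6, 5, 4, 3, 2, 1]
    start = even_split(len(costs), 2)
    print(start, loads(costs, start))
    print(local_search(costs, start))
```

With these made-up costs, the even split assigns unequal work to the two cores, and the local search moves the boundary one column to the left before it stops at a local optimum. A real scheme would need measured or modelled kernel costs; the search accepts only strict improvements, so it always terminates.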

Place, publisher, year, edition, pages
Elsevier, 2014. No. 7, pp. 271-288
Keywords [en]
Fine-grained parallelism, Scalability, Load-balancing, Load-balance optimization, Auto-tuning
HSV category
Research programme
Administrative data processing
Identifiers
URN: urn:nbn:se:umu:diva-79742
DOI: 10.1016/j.parco.2014.04.003
ISI: 000339598400010
OAI: oai:DiVA.org:umu-79742
DiVA, id: diva2:644471
Conference
7th International Workshop on Parallel Matrix Algorithms and Applications, London, June 28-30, 2012
Note

Volume: 40 Issue: 7 Pages: 271-288 Special Issue: SI

Available from: 2013-08-30. Created: 2013-08-30. Last updated: 2018-06-08. Bibliographically approved.

Open Access in DiVA

PARCO-D-12-00193.pdf (703 kB), 330 downloads
File information
File name: FULLTEXT01.pdf
File size: 703 kB
Checksum (SHA-512): 75e219e892de22965b1cde4fc7605e62992118db65cd2dd2cee87a0b057ffb6776391474e8a5549019a395330d30c736ca627a468a57addc8a6a56b7f86bffff
Type: fulltext
Mimetype: application/pdf


Person records

Karlsson, Lars; Kågström, Bo; Wadbro, Eddie
