The efficiency of matrix–vector multiplication is of considerable importance. No current approach can optimize it sufficiently well under severe time constraints. All major existing methods rely on either manual tuning or auto-tuning and can therefore be time-consuming. We introduce an alternative model-driven approach that maps the implementation of matrix–vector multiplication to a target architecture and obtains its parameters analytically. The approach yields performance that is competitive with optimized Basic Linear Algebra Subprograms (BLAS)-like dense linear algebra libraries without the need for manual tuning or auto-tuning. Our method provides competitive performance across hardware architectures and can be used to obtain single-threaded and multi-threaded implementations on multicore processors. We expect that this approach will allow the community to progress from valuable engineering solutions to techniques with broader application. © 2021 John Wiley & Sons, Ltd.
Original language: English
Pages (from-to): 8769-8799
Number of pages: 31
Journal: Mathematical Methods in the Applied Sciences
Volume: 45
Issue number: 15
Publication status: Published - 1 Oct 2022

    WoS ResearchAreas Categories

  • Mathematics

    ASJC Scopus subject areas

  • General Mathematics
  • General Engineering

ID: 30834578