Research output: Contribution to journal › Article › peer-review
TY - JOUR
T1 - Analytical modeling of matrix–vector multiplication on multicore processors
AU - Gareev, Roman A.
AU - Akimova, Elena N.
PY - 2022/10/1
Y1 - 2022/10/1
AB - The efficiency of matrix–vector multiplication is of considerable importance. No current approaches can optimize this sufficiently well under severe time constraints. All major existing methods are based on either manual-tuning or auto-tuning and can therefore be time-consuming. We introduce an alternative model-driven approach, which is used to map the implementation of matrix–vector multiplication to a target architecture and analytically obtain its parameters. The approach yields the performance that is competitive with optimized Basic Linear Algebra Subprograms (BLAS)-like dense linear algebra libraries without the need for manual-tuning or auto-tuning. Our method provides competitive performance across hardware architectures and can be utilized to obtain single-threaded and multi-threaded implementations on multicore processors. We expect that this approach allows the community to progress from valuable engineering solutions to techniques with a broader application. © 2021 John Wiley & Sons, Ltd.
UR - https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=tsmetrics&SrcApp=tsm_test&DestApp=WOS_CPL&DestLinkType=FullRecord&KeyUT=000607457400001
UR - http://www.scopus.com/inward/record.url?partnerID=8YFLogxK&scp=85099368206
U2 - 10.1002/mma.7045
DO - 10.1002/mma.7045
M3 - Article
VL - 45
SP - 8769
EP - 8799
JO - Mathematical Methods in the Applied Sciences
JF - Mathematical Methods in the Applied Sciences
SN - 0170-4214
IS - 15
ER -