Chris Sidebottom
740efd71c4
Add optimized BGEMM kernel for NEOVERSEV1 target
...
This also improves the testing and generic kernel by re-using the BF16
conversion functions.
Built on top of https://github.com/OpenMathLib/OpenBLAS/pull/5357 and derived from https://github.com/OpenMathLib/OpenBLAS/pull/5287
Co-authored-by: Ye Tao <ye.tao@arm.com>
2025-07-10 23:23:27 +00:00
Srangrang
9f13b2c6ac
style: modify HALF to BFLOAT16 in benchmark folder
2025-06-15 20:57:05 +08:00
gkdddd
670ec6f757
Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 for RISCV64_ZVL256B
...
Added HFLOAT16 support for RISCV64
Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 for RISCV64_ZVL256B based on HFLOAT16
The instruction set extensions used are ZVFH and ZFH, which require a target that supports RVV 1.0
Related to issue #5279
Co-authored-by: Linjin Li <linjin_li@163.com>
2025-06-03 20:14:30 +08:00
Qiyu8
dd6ebdfdab
Refactor the performance measurement system
2020-10-23 10:32:03 +08:00
Martin Kroeker
7ae9e8960e
Change "HALF" and "sh" to "BFLOAT16" and "sb"
2020-10-12 00:08:29 +02:00
Martin Kroeker
5464eb13ea
Change ifdef linux to __linux for C11 compatibility
2020-09-30 22:59:41 +02:00
Rajalakshmi Srinivasaraghavan
ce90e2bd3f
Include shgemm in benchtest
...
This patch enables benchtest for half-precision GEMM
when BUILD_HALF is set during make.
2020-05-11 09:57:46 -05:00
AbdelRauf
a469b32cf4
sgemm pipeline improved, zgemm rewritten without inner packs, ABI lxvx v20 fixed with vs52
2019-06-04 07:11:30 +00:00
Martin Kroeker
35c5a32309
Correct index variables used in MFlops calculation
...
Fixes #1474
2018-03-27 21:52:29 +02:00
Tim Moon
a89d6711c6
Increasing flexibility of GEMM benchmark.
...
m, n, and k can be set to arbitrary constants. A and B matrices can be transposed independently.
2017-09-28 12:56:29 -07:00
Ashwin Sekhar T K
67874468a6
Fix bug in benchmark/gemm.c
2015-11-09 14:15:54 +05:30
Werner Saar
e19948baa1
small modification of gemm.c
2015-06-03 09:11:51 +02:00
Werner Saar
887aed634d
modified sources for OS Darwin
2014-12-19 12:40:46 +01:00
Werner Saar
1e566223ed
added code for the size of n
2014-12-17 15:02:11 +01:00
wernsaar
29125864b3
updated gemm.c
2014-08-23 17:28:01 +02:00
wernsaar
1d4ffddf69
added conf option for number of loops
2014-07-12 11:54:39 +02:00
wernsaar
e27433ab6a
added gemm benchmark and modified Makefile for benchmark
2014-07-11 11:09:47 +02:00