Chris Sidebottom
2c3cdaf74e
Optimized BGEMV for NEOVERSEV1 target
...
- Adds bgemv T based off of sbgemv T kernel
- Adds bgemv N which is slightly altered to not use Y as an
accumulator due to the output being bf16 which results in loss of
precision
- Enables BGEMM_GEMV_FORWARD to proxy BGEMM to BGEMV with new kernels
2025-07-23 10:51:41 +01:00
Chris Sidebottom
740efd71c4
Add optimized BGEMM kernel for NEOVERSEV1 target
...
This also improves the testing and generic kernel by re-using the BF16
conversion functions.
Built on top of https://github.com/OpenMathLib/OpenBLAS/pull/5357 and derived from https://github.com/OpenMathLib/OpenBLAS/pull/5287
Co-authored-by: Ye Tao <ye.tao@arm.com>
2025-07-10 23:23:27 +00:00
Srangrang
9f13b2c6ac
style: modify HALF to BFLOAT16 in benchmark folder
2025-06-15 20:57:05 +08:00
gxw
ffaa5765a4
Bench: Add omatcopy
2024-10-18 11:07:52 +08:00
Sergei Lewis
3ffd6868d7
Merge branch 'develop' into dev/slewis/merge-from-riscv
2024-02-01 11:29:41 +00:00
gxw
3d4dfd0085
Benchmark: Rename the executable file names for {sc/dz}a{min/max}
...
No interface named {c/z}a{min/max}, keeping it would
cause ambiguity
2024-01-30 11:33:01 +08:00
HellerZheng
943372bdf5
Merge branch 'develop' into develop
2022-11-18 10:12:46 +08:00
Martin Kroeker
f92dd6e303
change line endings from CRLF to LF
2022-11-17 10:18:36 +01:00
Heller Zheng
bef47917bd
Initial version for riscv sifive x280
2022-11-15 00:06:25 -08:00
Martin Kroeker
7ae9e8960e
Change "HALF" and "sh" to "BFLOAT16" and "sb"
2020-10-12 00:08:29 +02:00
Martin Kroeker
ced49466f0
Use the fortran compiler to link LAPACK-related benchmarks
...
to fix linking problems with (at least) the AMD version of flang that creates dependencies on more than just the fortran runtime.
2020-05-29 13:35:51 +02:00
Rajalakshmi Srinivasaraghavan
ce90e2bd3f
Include shgemm in benchtest
...
This patch is to enable benchtest for half precision gemm
when BUILD_HALF is set during make.
2020-05-11 09:57:46 -05:00
Martin Kroeker
717c604aeb
Merge pull request #2515 from zelong-1024/develop
...
[OpenBLAS]: benchmark for her/her2 LEVEL2 functions
2020-03-16 21:59:55 +01:00
Martin Kroeker
ce33da4cab
Merge pull request #2513 from aaawuanjun/develop
...
[OpenBlas]: Add benchmark tpsv file and modify benchmark/Makefile
2020-03-16 21:58:55 +01:00
l00536773
d45c53ecf1
[OpenBLAS]: benchmark for her/her2 LEVEL2 functions
...
[description]: benchmark for her/her2
[solution]: added benchmark for her/her2, modified makefile in benchmark
[dts]:
2020-03-16 11:19:05 +08:00
Martin Kroeker
c0649aa694
Merge pull request #2506 from xiaofengF/develop
...
Add benchmark for SPMV and fix segmentation fault when data size >= 50000
2020-03-14 13:08:36 +01:00
wuanjun 00447568
2428dc9fd3
[OpenBlas]: Add benchmark tpsv file and modify benchmark/Makefile
...
[Description]: Solve lack of tpsv benchmark.
2020-03-14 09:11:08 +08:00
jayfely@qq.com
ae3f2c2e49
Remove cspmv and zspmv to remove the error occurred in travis CI
2020-03-11 17:02:34 +08:00
jayfely@qq.com
649733ff15
Only keep spmv.goto and spmv.atlas
2020-03-11 15:48:58 +08:00
wuanjun 00447568
3e8f1c6cc5
[OpenBlas]: Add benchmark tpmv.c and modify Makefile
...
[Description]: Solve the problem of missing tpmv.c benchmark file
2020-03-11 12:31:48 +08:00
jayfely@qq.com
08e1d8cbae
Modify Makefile in Benchmark
2020-03-10 14:32:18 +08:00
jayfely@qq.com
ff40a4e726
Add benchmark for SPMV
2020-03-10 14:22:18 +08:00
s00548429
c5bdd21352
Add benchmark for ?amax, ?max, ?amin, ?min, i?max, i?amin and i?min.
2020-03-09 14:59:03 +08:00
Martin Kroeker
b6a6ccbbea
Merge pull request #2495 from ZuoQ3/develop
...
add benchmark for axpby test
2020-03-08 08:09:58 +01:00
zq
0c8162eba6
Add benchmark file axpby.c and modify benchmark/Makefile to test s/d/c/zaxpby
2020-03-07 17:48:55 +08:00
shengyang
09c7a191bd
add benchmark for csrot and zdrot
...
modified: benchmark/Makefile
modified: benchmark/rot.c
2020-03-07 15:17:49 +08:00
Martin Kroeker
dca3e0cf20
Merge pull request #2491 from chenxuqiang/hbmv_benchmark
...
benchmark/hpmv&hbmv: add benchmark/hpmv.c and benchmark/hbmv.c
2020-03-06 15:06:42 +01:00
Martin Kroeker
c9f8db979b
Merge pull request #2490 from shengyang-3390/develop
...
Add benchmark file rotm.c and modify benchmark/Makefile to test s/drotm
2020-03-06 15:05:55 +01:00
Martin Kroeker
97c36ca58c
Merge branch 'develop' into develop
2020-03-06 14:41:40 +01:00
Martin Kroeker
9f5a74f3c7
Merge pull request #2486 from qqqil/develop
...
add benchmark for trsv
2020-03-06 14:30:09 +01:00
Martin Kroeker
2afb10975d
Merge pull request #2485 from Darkness303/develop
...
Add syr2 benchmark
2020-03-06 14:29:27 +01:00
chenxuqiang
32c847df45
benchmark/hpmv&hbmv: add benchmark/hpmv.c and benchmark/hbmv.c
...
Signed-off-by: Xuqiang Chen <chenxuqiang3@hisilicon.com>
2020-03-06 01:02:02 -05:00
shengyang
e0df9485d4
Add benchmark file rotm.c and modify benchmark/Makefile to test s/drotm
...
modified: benchmark/Makefile
new file: benchmark/rotm.c
2020-03-05 10:05:59 +08:00
s00527847
0f1a2b12f9
add benchmark for spr/spr2
2020-03-04 15:50:19 -05:00
q00437336
de74e11641
add benchmark for trsv
2020-03-04 03:23:22 -05:00
Darkness303
114dbec947
1.Add syr2 benchmark
...
2.Fixed some errors
2020-03-04 14:09:10 +08:00
wuanjun 00447568
f682d19ed4
[OpenBlas]: add benchmark file trmv.c and modify benchmark/Makefile to test s/d/c/ztrmv
2020-03-03 17:37:33 +08:00
j00520245
e1062400c4
Add new syr benchmark
2020-02-28 16:36:53 +08:00
Ashwin Sekhar T K
1530e78cfe
Benchmarks: Avoid building lapack benchmarks when NO_LAPACK=1
2017-01-24 20:50:23 -08:00
Ashwin Sekhar T K
925d4e1dc6
Add IAMAX and NRM2 benchmarks
2016-07-14 13:46:01 +05:30
Werner Saar
318cad9c37
added trsm benchmarks for POWER8 to benchmark/Makefile
2016-05-22 13:51:47 +02:00
Werner Saar
dd2b897795
added bugfixes for some make files and smallscaling.c
2016-04-21 12:54:32 +02:00
Werner Saar
1ca750471a
added cholesky benchmarks to Makefile for ESSL
2016-04-10 11:28:20 +02:00
Werner Saar
08bddde3f3
updated benchmark Makefile for ESSL
2016-04-08 10:37:59 +02:00
Werner Saar
12540cedb5
added ESSL to Makefile for benchmarks
2016-04-03 07:21:48 +02:00
Werner Saar
7a92c1538e
added benchmark test for srot and drot
2016-03-26 07:14:13 +01:00
Jerome Robert
323c237e7b
Fix smallscaling compilation
...
Also revert 0bbca5e
2016-03-10 20:24:41 +01:00
Werner Saar
0bbca5e803
removed build of smallscaling, because build on arm, arm64 and power fails
2016-03-06 11:54:41 +01:00
Jerome Robert
73397faf68
Add benchmark/smallscaling.c
...
* Bench small matrices with multi-threading
* Close #727
2016-02-08 11:25:27 +01:00
Werner Saar
6a13a94e71
added gesv benchmark
2015-06-02 13:35:49 +02:00