81 Commits

Author SHA1 Message Date
abhishek-fujitsu
720a4743b9 update contribution list 2025-07-25 11:08:22 +05:30
Martin Kroeker
0e11537cab Merge pull request #5357 from Mousius/bgemm-init
Add infrastructure for BGEMM
2025-07-09 09:34:58 +02:00
Martin Kroeker
fd37406817 Merge branch 'develop' into optimized_gemv_n_1x3 2025-07-08 21:05:30 +02:00
Chris Sidebottom
f95e7b0e32 Add infrastructure for BGEMM
Setting up all the infrastructure for BGEMM support in OpenBLAS, hopefully I found all the right places.

Derived mostly from the previous work done in https://github.com/OpenMathLib/OpenBLAS/pull/5287

Co-authored-by: Ye Tao <ye.tao@arm.com>
2025-07-08 16:22:41 +01:00
guoyuanplct
4ff549a450 Update CONTRIBUTORS.md 2025-07-08 17:16:51 +08:00
guoyuanplct
309c48e327 Update CONTRIBUTORS.md 2025-07-08 17:13:27 +08:00
Sharif Inamdar
8279e68805 Optimize gemv_n_sve_v1x3 kernel
- Calculate predicate outside the loop
- Divide matrix in blocks of 3
2025-06-11 10:16:56 +00:00
abhishek-fujitsu
0c239c9d48 update contribution list 2025-04-23 22:35:05 +05:30
Annop Wongwathanarat
ec146157d3 Use SVE kernel for S/DGEMVT for SVE machines 2025-04-09 20:38:14 +00:00
Annop Wongwathanarat
9807f56580 Optimize aarch64 sgemm_ncopy 2025-03-13 10:17:43 +00:00
Annop Wongwathanarat
a085b6c9ec Fix aarch64 sbgemv_t compilation error for GCC < 13 2025-03-12 14:52:42 +00:00
Martin Kroeker
2b941c44b5 Merge branch 'develop' into sbgemv_n_neon 2025-03-02 22:39:32 +01:00
Ye Tao
35bdbca153 Add sbgemv_n_neon kernel for arm64. 2025-02-28 14:37:06 +00:00
Annop Wongwathanarat
edaf51dd99 Add sbgemv_t_bfdot kernel for ARM64
This improves performance for sbgemv_t by up to 100x on NEOVERSEV1.
The geometric mean speedup is ~61x for M=N=[2,512].
2025-02-28 12:31:50 +00:00
Marek Michalowski
650a062e19 Add thread throttling profile for SGEMV on NEOVERSEV2 2025-02-20 10:28:31 +00:00
Marek Michalowski
b723c1b7b7 Add thread throttling profile for SGEMM on NEOVERSEV2 2025-02-20 10:28:21 +00:00
Ye Tao
c748e6a338 optimized sbgemm kernel for neoverse-v1 (sve-256)
Signed-off-by: Ye Tao <ye.tao@arm.com>
2025-02-05 10:06:37 +00:00
Martin Kroeker
6e393a5599 Merge branch 'develop' into gemv_t 2025-01-25 12:54:04 +01:00
Marek Michalowski
838bb57e27 Merge branch 'develop' into develop 2025-01-24 14:19:35 +00:00
Marek Michalowski
4d5b13f765 Add thread throttling profile for SGEMV on NEOVERSEV1 2025-01-22 10:50:04 +00:00
Annop Wongwathanarat
c0318cea6e Simplify gemv_t_sve_v1x3 kernel 2025-01-21 13:40:17 +00:00
Annop Wongwathanarat
c8cd8da496 Add thread throttling profile for SGEMM on NEOVERSEV1 2025-01-13 15:43:08 +00:00
CDAC-SSDG
41912f9c22 Update CONTRIBUTORS.md 2024-12-13 11:05:10 +05:30
CDAC-SSDG
2718b37fed Update CONTRIBUTORS.md 2024-10-30 13:57:13 +05:30
Chris Daley
cb48505251 optimize gemv forwarding on ARM64 systems 2024-10-24 21:05:26 -07:00
Jake Arkinstall
44004178aa Updated CONTRIBUTORS.md
As requested on X (https://x.com/KroekerMartin/status/1755218919290278185)
2024-06-01 11:22:26 +01:00
Mark Seminatore
b29fd48998 Merge branch 'develop' into win_tidy 2024-02-12 10:23:17 -08:00
Mark Seminatore
10548a0460 update contributors 2024-02-12 10:22:12 -08:00
Dirreke
ec89466e14 Add CSKY support 2024-01-16 23:45:06 +08:00
Mark Seminatore
5f51811728 try at new threading model 2023-12-05 22:43:36 -08:00
Martin Kroeker
616fdea82a Revert "Improve Windows threading performance scaling" 2023-06-28 09:45:17 +02:00
Mark Seminatore
427f9f2428 update contributors 2023-06-23 22:15:39 -07:00
Chris Sidebottom
bfc20c2e97 Add Chris Sidebottom to CONTRIBUTORS.md 2023-04-17 11:53:31 +01:00
Pablo Romero
1b1f781cf9 Added name and details to contributors' list. 2022-08-26 11:45:23 +02:00
Xianyi Zhang
f9715605ac Add PLCT to contributors. 2022-06-06 14:11:28 +08:00
Martin Kroeker
5d24f3d210 Update CONTRIBUTORS.md 2022-01-22 19:09:00 +01:00
Martin Kroeker
66a15e15a8 Update CONTRIBUTORS.md 2022-01-22 19:02:57 +01:00
Bine Brank
19d435b1b3 update armv8sve + contributors 2022-01-18 08:28:31 +01:00
Bine Brank
cbcea149f0 update contributors 2022-01-06 10:29:35 +01:00
Bine Brank
ca65a4e91d update CONTRIBUTORS.md 2021-11-26 13:11:19 +01:00
River Dillon
ddb6cee0d5 Contribution note 2021-07-10 01:34:47 -07:00
Xianyi Zhang
7834c10e2f Add PingTouGe contribution credit. 2020-12-07 16:55:05 +08:00
Marius Hillenbrand
f7731a358a Update CONTRIBUTERS.md - clang build fixes for IBM z
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-09-08 19:34:18 +02:00
张丹枫
2a3aa91354 update CONTRIBUTORS.md, adding myself 2020-05-20 22:35:26 +08:00
Marius Hillenbrand
cb9dc36dd5 Update CONTRIBUTORS.md
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-05-12 16:14:00 +02:00
Marius Hillenbrand
d7c1677c20 Update CONTRIBUTORS.md, adding myself
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
2020-05-12 11:09:28 +02:00
Martin Kroeker
3e28db7f38 Update CONTRIBUTORS.md 2020-04-25 13:51:44 +02:00
wjc404
9f5cdc49d4 Update CONTRIBUTORS.md 2020-01-06 12:28:43 +08:00
wjc404
bb2729c855 Update CONTRIBUTORS.md 2019-12-30 16:11:37 +08:00
wjc404
aae44d040d Update CONTRIBUTORS.md 2019-12-30 16:10:08 +08:00