abhishek-fujitsu
|
720a4743b9
|
update contribution list
|
2025-07-25 11:08:22 +05:30 |
|
Martin Kroeker
|
0e11537cab
|
Merge pull request #5357 from Mousius/bgemm-init
Add infrastructure for BGEMM
|
2025-07-09 09:34:58 +02:00 |
|
Martin Kroeker
|
fd37406817
|
Merge branch 'develop' into optimized_gemv_n_1x3
|
2025-07-08 21:05:30 +02:00 |
|
Chris Sidebottom
|
f95e7b0e32
|
Add infrastructure for BGEMM
Setting up all the infrastructure for BGEMM support in OpenBLAS; hopefully I found all the right places.
Derived mostly from the previous work done in https://github.com/OpenMathLib/OpenBLAS/pull/5287
Co-authored-by: Ye Tao <ye.tao@arm.com>
|
2025-07-08 16:22:41 +01:00 |
|
guoyuanplct
|
4ff549a450
|
Update CONTRIBUTORS.md
|
2025-07-08 17:16:51 +08:00 |
|
guoyuanplct
|
309c48e327
|
Update CONTRIBUTORS.md
|
2025-07-08 17:13:27 +08:00 |
|
Sharif Inamdar
|
8279e68805
|
Optimize gemv_n_sve_v1x3 kernel
- Calculate predicate outside the loop
- Divide matrix in blocks of 3
|
2025-06-11 10:16:56 +00:00 |
|
abhishek-fujitsu
|
0c239c9d48
|
update contribution list
|
2025-04-23 22:35:05 +05:30 |
|
Annop Wongwathanarat
|
ec146157d3
|
Use SVE kernel for S/DGEMVT for SVE machines
|
2025-04-09 20:38:14 +00:00 |
|
Annop Wongwathanarat
|
9807f56580
|
Optimize aarch64 sgemm_ncopy
|
2025-03-13 10:17:43 +00:00 |
|
Annop Wongwathanarat
|
a085b6c9ec
|
Fix aarch64 sbgemv_t compilation error for GCC < 13
|
2025-03-12 14:52:42 +00:00 |
|
Martin Kroeker
|
2b941c44b5
|
Merge branch 'develop' into sbgemv_n_neon
|
2025-03-02 22:39:32 +01:00 |
|
Ye Tao
|
35bdbca153
|
Add sbgemv_n_neon kernel for arm64.
|
2025-02-28 14:37:06 +00:00 |
|
Annop Wongwathanarat
|
edaf51dd99
|
Add sbgemv_t_bfdot kernel for ARM64
This improves performance for sbgemv_t by up to 100x on NEOVERSEV1.
The geometric mean speedup is ~61x for M=N=[2,512].
|
2025-02-28 12:31:50 +00:00 |
|
Marek Michalowski
|
650a062e19
|
Add thread throttling profile for SGEMV on NEOVERSEV2
|
2025-02-20 10:28:31 +00:00 |
|
Marek Michalowski
|
b723c1b7b7
|
Add thread throttling profile for SGEMM on NEOVERSEV2
|
2025-02-20 10:28:21 +00:00 |
|
Ye Tao
|
c748e6a338
|
optimized sbgemm kernel for neoverse-v1 (sve-256)
Signed-off-by: Ye Tao <ye.tao@arm.com>
|
2025-02-05 10:06:37 +00:00 |
|
Martin Kroeker
|
6e393a5599
|
Merge branch 'develop' into gemv_t
|
2025-01-25 12:54:04 +01:00 |
|
Marek Michalowski
|
838bb57e27
|
Merge branch 'develop' into develop
|
2025-01-24 14:19:35 +00:00 |
|
Marek Michalowski
|
4d5b13f765
|
Add thread throttling profile for SGEMV on NEOVERSEV1
|
2025-01-22 10:50:04 +00:00 |
|
Annop Wongwathanarat
|
c0318cea6e
|
Simplify gemv_t_sve_v1x3 kernel
|
2025-01-21 13:40:17 +00:00 |
|
Annop Wongwathanarat
|
c8cd8da496
|
Add thread throttling profile for SGEMM on NEOVERSEV1
|
2025-01-13 15:43:08 +00:00 |
|
CDAC-SSDG
|
41912f9c22
|
Update CONTRIBUTORS.md
|
2024-12-13 11:05:10 +05:30 |
|
CDAC-SSDG
|
2718b37fed
|
Update CONTRIBUTORS.md
|
2024-10-30 13:57:13 +05:30 |
|
Chris Daley
|
cb48505251
|
optimize gemv forwarding on ARM64 systems
|
2024-10-24 21:05:26 -07:00 |
|
Jake Arkinstall
|
44004178aa
|
Updated CONTRIBUTORS.md
As requested on X (https://x.com/KroekerMartin/status/1755218919290278185)
|
2024-06-01 11:22:26 +01:00 |
|
Mark Seminatore
|
b29fd48998
|
Merge branch 'develop' into win_tidy
|
2024-02-12 10:23:17 -08:00 |
|
Mark Seminatore
|
10548a0460
|
update contributors
|
2024-02-12 10:22:12 -08:00 |
|
Dirreke
|
ec89466e14
|
Add CSKY support
|
2024-01-16 23:45:06 +08:00 |
|
Mark Seminatore
|
5f51811728
|
try at new threading model
|
2023-12-05 22:43:36 -08:00 |
|
Martin Kroeker
|
616fdea82a
|
Revert "Improve Windows threading performance scaling"
|
2023-06-28 09:45:17 +02:00 |
|
Mark Seminatore
|
427f9f2428
|
update contributors
|
2023-06-23 22:15:39 -07:00 |
|
Chris Sidebottom
|
bfc20c2e97
|
Add Chris Sidebottom to CONTRIBUTORS.md
|
2023-04-17 11:53:31 +01:00 |
|
Pablo Romero
|
1b1f781cf9
|
Added name and details to contributors' list.
|
2022-08-26 11:45:23 +02:00 |
|
Xianyi Zhang
|
f9715605ac
|
Add PLCT to contributors.
|
2022-06-06 14:11:28 +08:00 |
|
Martin Kroeker
|
5d24f3d210
|
Update CONTRIBUTORS.md
|
2022-01-22 19:09:00 +01:00 |
|
Martin Kroeker
|
66a15e15a8
|
Update CONTRIBUTORS.md
|
2022-01-22 19:02:57 +01:00 |
|
Bine Brank
|
19d435b1b3
|
update armv8sve + contributors
|
2022-01-18 08:28:31 +01:00 |
|
Bine Brank
|
cbcea149f0
|
update contributors
|
2022-01-06 10:29:35 +01:00 |
|
Bine Brank
|
ca65a4e91d
|
update CONTRIBUTORS.md
|
2021-11-26 13:11:19 +01:00 |
|
River Dillon
|
ddb6cee0d5
|
Contribution note
|
2021-07-10 01:34:47 -07:00 |
|
Xianyi Zhang
|
7834c10e2f
|
Add PingTouGe contribution credit.
|
2020-12-07 16:55:05 +08:00 |
|
Marius Hillenbrand
|
f7731a358a
|
Update CONTRIBUTORS.md - clang build fixes for IBM z
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
|
2020-09-08 19:34:18 +02:00 |
|
张丹枫
|
2a3aa91354
|
update CONTRIBUTORS.md, adding myself
|
2020-05-20 22:35:26 +08:00 |
|
Marius Hillenbrand
|
cb9dc36dd5
|
Update CONTRIBUTORS.md
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
|
2020-05-12 16:14:00 +02:00 |
|
Marius Hillenbrand
|
d7c1677c20
|
Update CONTRIBUTORS.md, adding myself
Signed-off-by: Marius Hillenbrand <mhillen@linux.ibm.com>
|
2020-05-12 11:09:28 +02:00 |
|
Martin Kroeker
|
3e28db7f38
|
Update CONTRIBUTORS.md
|
2020-04-25 13:51:44 +02:00 |
|
wjc404
|
9f5cdc49d4
|
Update CONTRIBUTORS.md
|
2020-01-06 12:28:43 +08:00 |
|
wjc404
|
bb2729c855
|
Update CONTRIBUTORS.md
|
2019-12-30 16:11:37 +08:00 |
|
wjc404
|
aae44d040d
|
Update CONTRIBUTORS.md
|
2019-12-30 16:10:08 +08:00 |
|