9463 Commits

Author SHA1 Message Date
Martin Kroeker
444d03db9c switch to another site that still has libffi6 (for now) 2025-07-23 14:04:11 +02:00
Martin Kroeker
2f81d6e60c Merge pull request #5390 from martin-frbg/issue5388-2
Declare the "small" complex DOT and AXPY kernels for RISCV-ZVL256B static in addition to inline
2025-07-22 13:05:14 +02:00
Martin Kroeker
e2d941e9af Declare the "small" kernel static in addition to inline 2025-07-22 11:02:32 +02:00
Martin Kroeker
8214700930 Declare the "small" kernel static in addition to inline 2025-07-22 11:01:37 +02:00
Martin Kroeker
4ae8707b54 Merge pull request #5389 from martin-frbg/issue5388
Add cross-compilation parameters for RISCV64 targets in CMake
2025-07-22 10:57:59 +02:00
Martin Kroeker
b24212f5df fix numbers 2025-07-21 22:54:52 +02:00
Martin Kroeker
6ff06f5483 Add cross-compilation data for RISCV64 targets 2025-07-21 22:42:15 +02:00
Martin Kroeker
d92f151634 Merge pull request #5386 from martin-frbg/issue5384
Fixes for some gcc warnings
2025-07-19 08:33:51 +02:00
Martin Kroeker
30dbca5051 fix misleading indentation to silence a gcc warning 2025-07-18 23:51:04 +02:00
Martin Kroeker
38e6999295 format cleanup 2025-07-18 23:45:08 +02:00
Martin Kroeker
3df503cafd portability fix and cleanup 2025-07-18 23:41:57 +02:00
Martin Kroeker
39c90f9859 Merge pull request #5380 from quic/topic/sgemm_direct_sme1_alpha_beta
SME1 based direct kernel (with alpha and beta) for cblas_sgemm level 3
2025-07-18 23:23:39 +02:00
Rajendra Prasad Matcha
eae0abfdb6 SME1 based direct kernel with alpha and beta for cblas_sgemm level 3 API. 2025-07-17 16:14:31 +05:30
Martin Kroeker
ac8cbfdd8e Merge pull request #5381 from Mousius/bgemv-infrastructure
Add infrastructure for BGEMV
2025-07-16 23:22:08 +02:00
Martin Kroeker
1742decdcb Merge pull request #5375 from lowkeyrossi/CI_for_WoA
Add CI support for building and validating OpenBLAS on WoA
2025-07-15 21:16:03 +02:00
Martin Kroeker
08df0f02d9 Merge pull request #5382 from martin-frbg/issue5379
Update cross-compilation instructions for the Android NDK
2025-07-15 21:07:34 +02:00
Martin Kroeker
7d7757acd1 Update cross-compilation instructions for the Android NDK 2025-07-15 18:25:55 +02:00
Chris Sidebottom
947d7af4c9 Fix CMake references to bscal and bgemv 2025-07-15 15:41:53 +01:00
Chris Sidebottom
72d2ebb4dd Re-add GEMV fallback for Level3 2025-07-15 15:00:20 +01:00
Chris Sidebottom
e105411460 Add infrastructure for bgemv/bscal
- Sets up all the various entrypoints for `bgemv`
- Adds `bscal` for use in the `bgemv` interface
- Adds test cases for comparing `sgemv` and `bgemv`
- Adds generic kernels for `bgemv_n` and `bgemv_t` which are accurate enough to pass above tests
2025-07-15 14:48:57 +01:00
Martin Kroeker
666e1081ac Merge pull request #5378 from martin-frbg/cpuid_lunarlake
Add ID data for Intel Lunar Lake ("Core Ultra 200V series")
2025-07-13 23:18:22 +02:00
Martin Kroeker
3ea6322eff Merge pull request #5377 from Mousius/test-fixes
Improve bgemm and sbgemm testing
2025-07-13 23:03:35 +02:00
Martin Kroeker
848e9e6ba7 Add ID data for Intel Lunar Lake ("Core Ultra 200V series") 2025-07-13 20:34:19 +02:00
Chris Sidebottom
09a016fdf6 Split sbgemv test from sbgemm test 2025-07-13 13:39:44 +00:00
Chris Sidebottom
3f110c8272 Improve bgemm and sbgemm testing
- Fixes wrong return type for `is_close`
- Adds stricter compiler flags for test files so we don't see the above issue again
- Re-uses test helper functions between compare_sgemm_sbgemm/bgemm.c
2025-07-13 12:48:09 +00:00
newyork_loki
cb2c726716 Add CI support for OpenBLAS on WoA 2025-07-12 14:37:30 +05:30
newyork_loki
c8d41e4a32 Add CI support for OpenBLAS on WoA 2025-07-12 14:34:29 +05:30
Martin Kroeker
81b30d4538 Merge pull request #5374 from martin-frbg/fixup-5373
Fix compilation of the new bgemm test
2025-07-11 15:33:38 +02:00
Martin Kroeker
aad97c7763 Fix return type declaration 2025-07-11 15:32:41 +02:00
Martin Kroeker
7acb122a98 Merge pull request #5373 from Mousius/bgemm-optimized
Add optimized BGEMM kernel for NEOVERSEV1 target
2025-07-11 11:56:56 +02:00
Chris Sidebottom
740efd71c4 Add optimized BGEMM kernel for NEOVERSEV1 target
This also improves the testing and generic kernel by re-using the BF16
conversion functions.

Built on top of https://github.com/OpenMathLib/OpenBLAS/pull/5357 and derived from https://github.com/OpenMathLib/OpenBLAS/pull/5287

Co-authored-by: Ye Tao <ye.tao@arm.com>
2025-07-10 23:23:27 +00:00
Martin Kroeker
e927373f62 Merge pull request #5371 from martin-frbg/fixup-5357
Complete the infrastructure changes for adding BGEMM
2025-07-10 16:38:37 +02:00
Martin Kroeker
9a272fece6 Re-enable the BGEMM tests 2025-07-10 15:02:59 +02:00
Martin Kroeker
b54aec804e remove spurious include 2025-07-10 15:00:30 +02:00
Martin Kroeker
343830c26f Add BGEMM parameter tables 2025-07-10 14:59:46 +02:00
Martin Kroeker
b37516add6 Add BGEMM parameters 2025-07-10 14:59:01 +02:00
Martin Kroeker
d030f81380 Merge pull request #5369 from martin-frbg/lapack1144
Fix workspace allocation in LAPACKE strsen/dtrsen (Reference-LAPACK PR 1144)
2025-07-10 10:46:15 +02:00
Martin Kroeker
b746f0eda3 Allocate IWORK to hold at least the one element for workspace queries 2025-07-10 08:58:16 +02:00
Martin Kroeker
b8f66ba0ee Merge pull request #5367 from Mousius/bgemm-init
Temporarily disable test_bgemm
2025-07-10 00:57:41 +02:00
Martin Kroeker
cdebb4fd4b Merge pull request #5365 from martin-frbg/issue5324
Fix arm64 HAVE_SME setting for DYNAMIC_ARCH builds using CMake
2025-07-09 22:50:54 +02:00
Martin Kroeker
ff614575c9 Fix arm64 HAVE_SME setting for DYNAMIC_ARCH builds 2025-07-09 14:44:25 +02:00
Martin Kroeker
0e11537cab Merge pull request #5357 from Mousius/bgemm-init
Add infrastructure for BGEMM
2025-07-09 09:34:58 +02:00
Chris Sidebottom
8cd4be8d47 Temporarily disable test_bgemm 2025-07-09 08:27:18 +01:00
Chris Sidebottom
66d9185ebe Fix CMake support 2025-07-08 22:49:55 +00:00
Martin Kroeker
98aefb70b4 Merge pull request #5292 from isharif168/optimized_gemv_n_1x3
Optimize gemv_n_sve_v1x3 kernel
2025-07-08 21:05:43 +02:00
Martin Kroeker
fd37406817 Merge branch 'develop' into optimized_gemv_n_1x3 2025-07-08 21:05:30 +02:00
Chris Sidebottom
48394384ef Use correct constants for per-target BGEMM/SBGEMM
This fixes the build and tests on `NEOVERSEV1` target, which was failing
with specific constants for `SBGEMM`

Co-authored-by: Ye Tao <ye.tao@arm.com>
2025-07-08 16:23:27 +01:00
Chris Sidebottom
73bf0b941a Add bgemm to gensymbol 2025-07-08 16:22:43 +01:00
Chris Sidebottom
f95e7b0e32 Add infrastructure for BGEMM
Setting up all the infrastructure for BGEMM support in OpenBLAS, hopefully I found all the right places.

Derived mostly from the previous work done in https://github.com/OpenMathLib/OpenBLAS/pull/5287

Co-authored-by: Ye Tao <ye.tao@arm.com>
2025-07-08 16:22:41 +01:00
Martin Kroeker
15d6e58510 Merge pull request #5364 from martin-frbg/blashalf
change BLAS_HALF to BLAS_BFLOAT16 in parallelized POTRF (another missed rename)
2025-07-08 17:14:50 +02:00