405 Commits

Author SHA1 Message Date
Martin Kroeker
fc516af155 Merge branch 'develop' into issue5414 2025-10-01 14:12:59 -07:00
Chris Marsh
c1f607c43c fix -lto_library filtering for apple-clang + gfortran on sdk26 (#5474)
* ensure filter-out applies to subsequent sdk versions
2025-09-30 16:28:37 +02:00
Martin Kroeker
1c7251ca20 remove the -lto_library option for any osx fortran compiler 2025-09-02 18:36:02 +02:00
Martin Kroeker
ccfd0170fb Enable SME on MacOS and add VORTEXM4 to DYNAMIC_ARCH list 2025-08-18 01:50:13 -07:00
abhishek-fujitsu
4c8dcb3a8f Darwin/arm64: disable SVE/SME and fix gfortran link path 2025-07-26 16:59:46 +05:30
abhishek-fujitsu
05fc88180c ARM64: Enable bfloat16 kernels by default 2025-07-25 11:08:22 +05:30
Chris Sidebottom
2c3cdaf74e Optimized BGEMV for NEOVERSEV1 target
- Adds bgemv T based off of sbgemv T kernel
- Adds bgemv N which is slightly altered to not use Y as an
accumulator due to the output being bf16 which results in loss of
precision
- Enables BGEMM_GEMV_FORWARD to proxy BGEMM to BGEMV with new kernels
2025-07-23 10:51:41 +01:00
Chris Sidebottom
f95e7b0e32 Add infrastructure for BGEMM
Setting up all the infrastructure for BGEMM support in OpenBLAS, hopefully I found all the right places.

Derived mostly from the previous work done in https://github.com/OpenMathLib/OpenBLAS/pull/5287

Co-authored-by: Ye Tao <ye.tao@arm.com>
2025-07-08 16:22:41 +01:00
Martin Kroeker
d96daa220d Merge pull request #5290 from Srangrang/develop
Add support for FP16 to OpenBLAS and shgemm on RISCV
2025-06-24 23:10:15 +02:00
Martin Kroeker
12591caa91 Merge pull request #5334 from azuresky01/develop
Fix INTERFACE64 builds on Loongarch64 with LLVM
2025-06-24 16:09:25 +02:00
azuresky01
8953ba9c2f Fix INTERFACE64 builds on Loongarch64 with LLVM
fix https://github.com/OpenMathLib/OpenBLAS/issues/5331
2025-06-24 14:27:15 +08:00
davidz-ampere
be68ef03b4 Add support for Ampere processors 2025-06-15 22:00:40 -04:00
Srangrang
ec14e1648c fix: resolve build failure on non-RISCV hosts
- adjust interface to disable "small matrix" pathway
- separate HFLOAT16 from BFLOAT16
- remove SHGEMM_UNROLL_M and SHGEMM_UNROLL_N equal conditions

Related to PR#5290
Co-authored-by Martin
2025-06-15 20:25:15 +08:00
Srangrang
4e1a381e5b fix: resolve the compilation failure without zfh instruction
- modify the macro conditions in Makefile.system
- Delete development test code

Related to issue#5279
2025-06-04 20:00:12 +08:00
gkdddd
670ec6f757 Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 for RISCV64_ZVL256B
Added HFLOAT16 support for RISCV64
Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 for RISCV64_ZVL256B based on HFLOAT16
The instruction sets used are ZVFH and ZFH, which need to be supported by RVV1.0

Related to issue #5279
Co-authored-by Linjin Li <linjin_li@163.com>
2025-06-03 20:14:30 +08:00
Srangrang
0a967797a1 Add FP16 support for RISCV 2025-05-27 14:34:57 +08:00
Ye Tao
7321444660 enable sbgemm to be forwarded to sbgemv on arm64 2025-05-12 15:37:32 +00:00
Harmen Stoppels
51ba70f47b test_potrs.c: remove pragma darwin-aarch64 support
Using GCC 14.2.0 on Darwin, the pragma ultimately causes a linker error
"ld: invalid r_symbolnum=". The current workaround is to use the old
linker, but (a) it's deprecated and (b) it can produce libraries that
are subsequently not linkable with the newer linker in dependents: the
new ld64 does not link to libraries with duplicate rpaths created by the
classic linker.
2025-04-10 15:20:34 +02:00
Martin Kroeker
1ed962d259 Fix compilation with xcode16.3/clang17/gcc14 2025-04-06 10:44:48 -07:00
Vaisakh K V
f66ca05b31 Merge branch 'develop' into topic/sgemm_direct_sme1 2025-02-13 14:54:37 +05:30
Vaisakh K V
d23eb3b93e Support for SME1 based sgemm_direct kernel for cblas_sgemm level 3 API
* Added ARMV9SME target
* Added SGEMM_DIRECT kernel based on SME1
2025-02-13 14:51:21 +05:30
Martin Kroeker
c1258662db Merge branch 'OpenMathLib:develop' into m3m_exprec 2024-12-30 15:58:15 +01:00
Martin Kroeker
9db51f790a Remove any optimization flags from DEBUG builds on POWER architecture 2024-11-17 23:19:58 +01:00
Martin Kroeker
d04686acd8 Re-enable the EXPRECISION option for non-Windows x86/x86_64 2024-11-14 14:09:01 -08:00
Caroline Newcombe
760bf7aa37 Update Fortran return for complex data types (Cray and Nvidia compilers) 2024-11-13 14:05:20 -06:00
Chip Kerchner
36bd3eeddf Vectorize BF16 GEMV (VSX & MMA). Use GEMM_GEMV_FORWARD_BF16 (for Power). 2024-10-13 13:46:11 -05:00
Martin Kroeker
a492181665 filter out Loongarch -mabi options for flang-new 2024-10-03 15:58:47 +02:00
Martin Kroeker
a1073f5eed Merge pull request #4900 from XiWeiGu/la64_core_rename
LoongArch64: Rename core
2024-10-01 15:29:16 +02:00
gxw
48698b2b1d LoongArch64: Rename core
Use the microarchitecture name instead of meaningless strings to name the core;
the legacy core names are still retained.
1. Rename LOONGSONGENERIC to LA64_GENERIC
2. Rename LOONGSON3R5 to LA464
3. Rename LOONGSON2K1000 to LA264
2024-09-29 09:35:21 +08:00
Martin Kroeker
969bb949b1 Strip any mtune option from FFLAGS if the compiler is flang-new 2024-09-19 11:10:28 +02:00
Martin Kroeker
383e0b133e remove suppression of gcc14's incompatible pointer error 2024-09-11 22:21:09 +02:00
Martin Kroeker
42d8865234 fix typo 2024-08-01 12:24:45 +02:00
Martin Kroeker
fcb88b9d52 enable GEMM/GEMV forwarding for riscv and ppc 2024-07-31 23:21:35 +02:00
Chris Sidebottom
b26424c6a2 Allow opt into GEMM -> GEMV forwarding 2024-07-31 13:09:14 +01:00
Martin Kroeker
a4e56e0452 Merge pull request #4806 from Mousius/small-gemm
Small GEMM for AArch64 with SVE
2024-07-25 21:50:04 +02:00
yamazaki-mitsufumi
821ef34635 Add A64FX to the list of CPUs supported by DYNAMIC_ARCH 2024-07-23 20:44:39 +09:00
Mark Ryan
3b715e6162 Add autodetection for riscv64
Implement DYNAMIC_ARCH support for riscv64.  Three cpu types are
supported, riscv64_generic, riscv64_zvl256b, riscv64_zvl128b.
The two non-generic kernels require CPU support for RVV 1.0 to
function correctly.  Detecting that a riscv64 device supports
RVV 1.0 is a little complicated as there are some boards on the
market that advertise support for V via hwcap but only support
RVV 0.7.1, which is not binary compatible with RVV 1.0.  The
approach taken is to first try hwprobe.  If hwprobe is not
available, we fall back to hwcap + an additional check to distinguish
between RVV 1.0 and RVV 0.7.1.

Tested on a VM with VLEN=256, a CanMV K230 with VLEN=128 (with only
the big core enabled), a Lichee Pi with RVV 0.7.1 and a VF2 with no
vector.

A compiler with RVV 1.0 support must be used to build OpenBLAS for
riscv64 when DYNAMIC_ARCH=1.

Signed-off-by: Mark Ryan <markdryan@rivosinc.com>
2024-07-15 14:24:22 +00:00
gxw
8ab2e9ec65 LoongArch: DGEMM small matrix opt 2024-06-04 16:52:45 +08:00
Martin Kroeker
4376b6f7d2 Restore Loongson LA64ARCH handling 2024-05-07 14:42:01 +02:00
Martin Kroeker
fc10673fd3 Merge branch 'develop' into hugetlb-doc 2024-05-07 13:31:39 +02:00
Martin Kroeker
9c4e10fbd1 sort hugetlb and shm alloc options 2024-05-04 14:48:02 +02:00
Martin Kroeker
7c915e64ca Silence a GCC14 warning/error in the f2c-converted LAPACK 2024-04-30 17:48:14 +02:00
Martin Kroeker
ae695d4ca0 Merge pull request #4642 from XiWeiGu/loongarch64_clang
CI: Add clang test for loongarch64
2024-04-23 18:25:49 +02:00
gxw
7cd438a5ac loongarch64: Fixed clang compilation issues 2024-04-23 19:19:11 +08:00
Martin Kroeker
0ec0746ae4 Update Makefile.system 2024-04-18 16:11:20 +02:00
Martin Kroeker
d6b0badc05 Fix declarations for EMBEDDED 2024-04-18 16:06:21 +02:00
Martin Kroeker
00ee5d0367 On ARM, do not assume -marm by default if OS_EMBEDDED=1 2024-04-12 15:59:45 +02:00
Chip Kerchner
1c13cda3fc Remove -openmp flag from XLF (since it doesn't support it). 2024-04-10 15:16:47 -05:00
Martin Kroeker
52b71a1673 Filter out FFLAGS that flang-new from LLVM18 no longer supports (#4569)
2024-03-22 17:02:39 +01:00
Martin Kroeker
a14176440a Add version macro for GCC12 2024-03-10 23:22:05 +01:00
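
Note on commit 3b715e6162 (riscv64 autodetection): the message describes a two-stage check, trying hwprobe first and falling back to hwcap plus an extra test that separates RVV 1.0 from RVV 0.7.1. The Python sketch below only illustrates that decision order and is not the OpenBLAS implementation. The function name, its parameters, and the use of VLEN to choose between the two vector kernels are assumptions made for illustration; only the three core names come from the commit message.

```python
from typing import Optional


def select_riscv64_core(
    hwprobe_rvv10: Optional[bool],  # None means hwprobe is not available (hypothetical input)
    hwcap_has_v: bool,              # hwcap advertises the V extension (hypothetical input)
    passes_rvv10_check: bool,       # result of the extra RVV 1.0 vs 0.7.1 test (hypothetical input)
    vlen_bits: int,                 # vector register length in bits (assumed selection criterion)
) -> str:
    """Return one of the three core names listed in the commit message."""
    if hwprobe_rvv10 is not None:
        # Stage 1: hwprobe is available, so its answer is used directly.
        has_rvv10 = hwprobe_rvv10
    else:
        # Stage 2: fall back to hwcap, but do not trust the V bit alone,
        # because some boards advertise V while only supporting RVV 0.7.1.
        has_rvv10 = hwcap_has_v and passes_rvv10_check

    if not has_rvv10:
        return "riscv64_generic"
    if vlen_bits >= 256:
        return "riscv64_zvl256b"
    if vlen_bits >= 128:
        return "riscv64_zvl128b"
    return "riscv64_generic"
```

With these assumed inputs, a board that reports V through hwcap but fails the extra check (the RVV 0.7.1 case mentioned in the commit) ends up on the generic kernels, as does a device with no vector support at all.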