Commit Graph

2624 Commits

Author SHA1 Message Date
Martin Kroeker
4d08156266 Use the generic C kernel for DNRM2 2026-01-11 21:58:31 +01:00
Martin Kroeker
d1de282a4e Improve the precision of S/CNRM2 by summing in double precision 2026-01-11 13:04:00 +01:00
Martin Kroeker
c040d5ed86 Merge pull request #5591 from quic/topic/ssyr2k_direct_sme1
Some checks failed
apple m / build (cmake, gfortran, 0, 0) (push) Has been cancelled
apple m / build (cmake, gfortran, 0, 1) (push) Has been cancelled
apple m / build (cmake, gfortran, 1, 0) (push) Has been cancelled
apple m / build (cmake, gfortran, 1, 1) (push) Has been cancelled
apple m / build (make, gfortran, 0, 0) (push) Has been cancelled
apple m / build (make, gfortran, 0, 1) (push) Has been cancelled
apple m / build (make, gfortran, 1, 0) (push) Has been cancelled
apple m / build (make, gfortran, 1, 1) (push) Has been cancelled
arm64 graviton cirun / build (cmake, gfortran) (push) Has been cancelled
arm64 graviton cirun / build (make, gfortran) (push) Has been cancelled
c910v qemu test / TEST (riscv64-linux-gnu, NO_SHARED=1 TARGET=C910V, C910V, riscv64-unknown-linux-gnu) (push) Has been cancelled
c910v qemu test / TEST (riscv64-linux-gnu, NO_SHARED=1 TARGET=RISCV64_GENERIC, RISCV64_GENERIC, riscv64-linux-gnu) (push) Has been cancelled
Run codspeed benchmarks / benchmarks (make, gfortran, ubuntu-22.04, 3.12) (push) Has been cancelled
Publish docs via GitHub Pages / Deploy docs (push) Has been cancelled
continuous build / build (cmake, clang, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (cmake, clang, gfortran, macos-latest) (push) Has been cancelled
continuous build / build (cmake, clang, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (cmake, clang, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / build (cmake, clang-21, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (cmake, clang-21, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (cmake, clang-21, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / build (cmake, gcc, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (cmake, gcc, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (cmake, gcc, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, clang, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, clang, gfortran, macos-latest) (push) Has been cancelled
continuous build / build (make, clang, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (make, clang, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, clang-21, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, clang-21, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (make, clang-21, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, gcc, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, gcc, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (make, gcc, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / msys2 (None, fc, int32, UCRT64, mingw-w64-ucrt-x86_64) (push) Has been cancelled
continuous build / msys2 (Release, fc, int32, CLANG64, mingw-w64-clang-x86_64) (push) Has been cancelled
continuous build / msys2 (Release, fc, int32, MINGW32, mingw-w64-i686) (push) Has been cancelled
continuous build / msys2 (Release, fc, int32, UCRT64, mingw-w64-ucrt-x86_64) (push) Has been cancelled
continuous build / msys2 (Release, fc, int64, -DBINARY=64 -DINTERFACE64=1, CLANG64, mingw-w64-clang-x86_64) (push) Has been cancelled
continuous build / msys2 (Release, fc, int64, -DBINARY=64 -DINTERFACE64=1, UCRT64, mingw-w64-ucrt-x86_64) (push) Has been cancelled
continuous build / cross_build (DYNAMIC_ARCH=1 TARGET=GENERIC, mips64el, mips64el-linux-gnuabi64) (push) Has been cancelled
continuous build / cross_build (TARGET=EV4, alpha, alpha-linux-gnu) (push) Has been cancelled
continuous build / cross_build (TARGET=MIPS1004K, mipsel, mipsel-linux-gnu) (push) Has been cancelled
continuous build / cross_build (TARGET=RISCV64_GENERIC, riscv64, riscv64-linux-gnu) (push) Has been cancelled
continuous build / neoverse_build (push) Has been cancelled
harmonyos / build (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=GENERIC, DYNAMIC_ARCH, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA264, LA264, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA464, LA464, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA64_GENERIC, LA64_GENERIC, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSON2K1000, LOONGSON2K1000, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSON3R5, LOONGSON3R5, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSONGENERIC, LOONGSONGENERIC, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=GENERIC, DYNAMIC_ARCH) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA264, LA264) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA464, LA464) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA64_GENERIC, LA64_GENERIC) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSON2K1000, LOONGSON2K1000) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSON3R5, LOONGSON3R5) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSONGENERIC, LOONGSONGENERIC) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=I6400, I6400, mipsisa64r6el-linux-gnuabi64) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=I6500, I6500, mipsisa64r6el-linux-gnuabi64) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=MIPS64_GENERIC, MIPS64_GENERIC, mips64el-linux-gnuabi64) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=P6600, P6600, mipsisa64r6el-linux-gnuabi64) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=SICORTEX, SICORTEX, mips64el-linux-gnuabi64) (push) Has been cancelled
riscv64 zvl256b qemu test / TEST (TARGET=RISCV64_GENERIC BINARY=64 ARCH=riscv64 DYNAMIC_ARCH=1, rv64,g=true,c=true,v=true,vext_spec=v1.0,vlen=256,elen=64, DYNAMIC_ARCH=1) (push) Has been cancelled
riscv64 zvl256b qemu test / TEST (TARGET=RISCV64_ZVL128B BINARY=64 ARCH=riscv64, rv64,g=true,c=true,v=true,vext_spec=v1.0,vlen=128,elen=64, RISCV64_ZVL128B) (push) Has been cancelled
riscv64 zvl256b qemu test / TEST (TARGET=RISCV64_ZVL256B BINARY=64 ARCH=riscv64 BUILD_BFLOAT16=1 BUILD_HFLOAT16=1, rv64,g=true,c=true,v=true,vext_spec=v1.0,vlen=256,elen=64,zfh=true,zvfh=true,zvfbfwma=true, RISCV64_ZVL256B) (push) Has been cancelled
Windows ARM64 CI / build (push) Has been cancelled
Nightly-Homebrew-Build / build-OpenBLAS-with-Homebrew (push) Has been cancelled
Support for SME1 based ssyr2k_direct kernel for cblas_ssyr2k level 3 API
2026-01-08 15:47:38 +01:00
Zhiqing xie
6939a43c3b Support for SME1 based ssyr2k_direct kernel for cblas_ssyr2k level 3 API 2026-01-08 11:09:04 +08:00
Amrita H S
b53d18b3ad Fixing warning messages in dgemm and dgemv kernels
Signed-off-by: Amrita H S <amritahs@linux.vnet.ibm.com>
2026-01-06 10:20:56 -06:00
Rajalakshmi Srinivasaraghavan
2283fcbbe7 POWER10: Reduce sgemm loop unrolling
With GCC 14, unnecessary move and lxvp instructions appear when unrolling the inner loop for larger sizes.
Reducing the loop unroll factor restores performance to GCC 11.
2026-01-04 17:01:01 -06:00
Martin Kroeker
d39b77748f Make .align conditional on not being on WoA and strip CRLF endings 2025-12-24 20:00:45 +01:00
Martin Kroeker
ac2c66321d remove special handling of C/ZDOT for LLVM on WoA 2025-12-19 17:04:21 +01:00
Martin Kroeker
cfa28bcf71 Support compilation with LLVM for Windows on Arm 2025-12-19 17:00:47 +01:00
pengxu
f6533ccea0 Fix floating point registers ld/st bug of Loongarch 2025-12-03 10:52:59 +08:00
Martin Kroeker
d6b25c43c6 Merge pull request #5542 from abhishek-iitmadras/abhishek_new_tt_a64fx
Some checks failed
apple m / build (cmake, gfortran, 0, 0) (push) Has been cancelled
apple m / build (cmake, gfortran, 0, 1) (push) Has been cancelled
apple m / build (cmake, gfortran, 1, 0) (push) Has been cancelled
apple m / build (cmake, gfortran, 1, 1) (push) Has been cancelled
apple m / build (make, gfortran, 0, 0) (push) Has been cancelled
apple m / build (make, gfortran, 0, 1) (push) Has been cancelled
apple m / build (make, gfortran, 1, 0) (push) Has been cancelled
apple m / build (make, gfortran, 1, 1) (push) Has been cancelled
arm64 graviton cirun / build (cmake, gfortran) (push) Has been cancelled
arm64 graviton cirun / build (make, gfortran) (push) Has been cancelled
c910v qemu test / TEST (riscv64-linux-gnu, NO_SHARED=1 TARGET=C910V, C910V, riscv64-unknown-linux-gnu) (push) Has been cancelled
c910v qemu test / TEST (riscv64-linux-gnu, NO_SHARED=1 TARGET=RISCV64_GENERIC, RISCV64_GENERIC, riscv64-linux-gnu) (push) Has been cancelled
Run codspeed benchmarks / benchmarks (make, gfortran, ubuntu-22.04, 3.12) (push) Has been cancelled
Publish docs via GitHub Pages / Deploy docs (push) Has been cancelled
continuous build / build (cmake, clang, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (cmake, clang, gfortran, macos-latest) (push) Has been cancelled
continuous build / build (cmake, clang, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (cmake, clang, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / build (cmake, clang-21, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (cmake, clang-21, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (cmake, clang-21, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / build (cmake, gcc, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (cmake, gcc, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (cmake, gcc, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, clang, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, clang, gfortran, macos-latest) (push) Has been cancelled
continuous build / build (make, clang, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (make, clang, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, clang-21, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, clang-21, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (make, clang-21, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, gcc, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, gcc, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (make, gcc, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / msys2 (None, fc, int32, UCRT64, mingw-w64-ucrt-x86_64) (push) Has been cancelled
continuous build / msys2 (Release, fc, int32, CLANG64, mingw-w64-clang-x86_64) (push) Has been cancelled
continuous build / msys2 (Release, fc, int32, MINGW32, mingw-w64-i686) (push) Has been cancelled
continuous build / msys2 (Release, fc, int32, UCRT64, mingw-w64-ucrt-x86_64) (push) Has been cancelled
continuous build / msys2 (Release, fc, int64, -DBINARY=64 -DINTERFACE64=1, CLANG64, mingw-w64-clang-x86_64) (push) Has been cancelled
continuous build / msys2 (Release, fc, int64, -DBINARY=64 -DINTERFACE64=1, UCRT64, mingw-w64-ucrt-x86_64) (push) Has been cancelled
continuous build / cross_build (DYNAMIC_ARCH=1 TARGET=GENERIC, mips64el, mips64el-linux-gnuabi64) (push) Has been cancelled
continuous build / cross_build (TARGET=EV4, alpha, alpha-linux-gnu) (push) Has been cancelled
continuous build / cross_build (TARGET=MIPS1004K, mipsel, mipsel-linux-gnu) (push) Has been cancelled
continuous build / cross_build (TARGET=RISCV64_GENERIC, riscv64, riscv64-linux-gnu) (push) Has been cancelled
continuous build / neoverse_build (push) Has been cancelled
harmonyos / build (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=GENERIC, DYNAMIC_ARCH, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA264, LA264, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA464, LA464, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA64_GENERIC, LA64_GENERIC, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSON2K1000, LOONGSON2K1000, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSON3R5, LOONGSON3R5, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSONGENERIC, LOONGSONGENERIC, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=GENERIC, DYNAMIC_ARCH) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA264, LA264) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA464, LA464) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA64_GENERIC, LA64_GENERIC) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSON2K1000, LOONGSON2K1000) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSON3R5, LOONGSON3R5) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSONGENERIC, LOONGSONGENERIC) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=I6400, I6400, mipsisa64r6el-linux-gnuabi64) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=I6500, I6500, mipsisa64r6el-linux-gnuabi64) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=MIPS64_GENERIC, MIPS64_GENERIC, mips64el-linux-gnuabi64) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=P6600, P6600, mipsisa64r6el-linux-gnuabi64) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=SICORTEX, SICORTEX, mips64el-linux-gnuabi64) (push) Has been cancelled
riscv64 zvl256b qemu test / TEST (TARGET=RISCV64_GENERIC BINARY=64 ARCH=riscv64 DYNAMIC_ARCH=1, rv64,g=true,c=true,v=true,vext_spec=v1.0,vlen=256,elen=64, DYNAMIC_ARCH=1) (push) Has been cancelled
riscv64 zvl256b qemu test / TEST (TARGET=RISCV64_ZVL128B BINARY=64 ARCH=riscv64, rv64,g=true,c=true,v=true,vext_spec=v1.0,vlen=128,elen=64, RISCV64_ZVL128B) (push) Has been cancelled
riscv64 zvl256b qemu test / TEST (TARGET=RISCV64_ZVL256B BINARY=64 ARCH=riscv64 BUILD_BFLOAT16=1 BUILD_HFLOAT16=1, rv64,g=true,c=true,v=true,vext_spec=v1.0,vlen=256,elen=64,zfh=true,zvfh=true,zvfbfwma=true, RISCV64_ZVL256B) (push) Has been cancelled
Windows ARM64 CI / build (push) Has been cancelled
Nightly-Homebrew-Build / build-OpenBLAS-with-Homebrew (push) Has been cancelled
[A64FX]: add tt for a64fx dot
2025-11-23 22:55:51 +01:00
Martin Kroeker
f7b7296bff Fix compilation with LLVM 2025-11-22 16:07:34 +01:00
Abhishek Kumar
a14caf464f add tt for a64fx dot
Signed-off-by: Abhishek Kumar <abhishek.r.kumar@fujitsu.com>
2025-11-20 12:14:17 +05:30
Martin Kroeker
28eeef5bbe Merge pull request #5538 from CheryDan/riscv/rot
Some checks failed
apple m / build (cmake, gfortran, 0, 0) (push) Has been cancelled
apple m / build (cmake, gfortran, 0, 1) (push) Has been cancelled
apple m / build (cmake, gfortran, 1, 0) (push) Has been cancelled
apple m / build (cmake, gfortran, 1, 1) (push) Has been cancelled
apple m / build (make, gfortran, 0, 0) (push) Has been cancelled
apple m / build (make, gfortran, 0, 1) (push) Has been cancelled
apple m / build (make, gfortran, 1, 0) (push) Has been cancelled
apple m / build (make, gfortran, 1, 1) (push) Has been cancelled
arm64 graviton cirun / build (cmake, gfortran) (push) Has been cancelled
arm64 graviton cirun / build (make, gfortran) (push) Has been cancelled
c910v qemu test / TEST (riscv64-linux-gnu, NO_SHARED=1 TARGET=C910V, C910V, riscv64-unknown-linux-gnu) (push) Has been cancelled
c910v qemu test / TEST (riscv64-linux-gnu, NO_SHARED=1 TARGET=RISCV64_GENERIC, RISCV64_GENERIC, riscv64-linux-gnu) (push) Has been cancelled
Run codspeed benchmarks / benchmarks (make, gfortran, ubuntu-22.04, 3.12) (push) Has been cancelled
Publish docs via GitHub Pages / Deploy docs (push) Has been cancelled
continuous build / build (cmake, clang, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (cmake, clang, gfortran, macos-latest) (push) Has been cancelled
continuous build / build (cmake, clang, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (cmake, clang, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / build (cmake, clang-21, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (cmake, clang-21, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (cmake, clang-21, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / build (cmake, gcc, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (cmake, gcc, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (cmake, gcc, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, clang, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, clang, gfortran, macos-latest) (push) Has been cancelled
continuous build / build (make, clang, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (make, clang, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, clang-21, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, clang-21, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (make, clang-21, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, gcc, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, gcc, gfortran, ubuntu-24.04-arm) (push) Has been cancelled
continuous build / build (make, gcc, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / msys2 (None, fc, int32, UCRT64, mingw-w64-ucrt-x86_64) (push) Has been cancelled
continuous build / msys2 (Release, fc, int32, CLANG64, mingw-w64-clang-x86_64) (push) Has been cancelled
continuous build / msys2 (Release, fc, int32, MINGW32, mingw-w64-i686) (push) Has been cancelled
continuous build / msys2 (Release, fc, int32, UCRT64, mingw-w64-ucrt-x86_64) (push) Has been cancelled
continuous build / msys2 (Release, fc, int64, -DBINARY=64 -DINTERFACE64=1, CLANG64, mingw-w64-clang-x86_64) (push) Has been cancelled
continuous build / msys2 (Release, fc, int64, -DBINARY=64 -DINTERFACE64=1, UCRT64, mingw-w64-ucrt-x86_64) (push) Has been cancelled
continuous build / cross_build (DYNAMIC_ARCH=1 TARGET=GENERIC, mips64el, mips64el-linux-gnuabi64) (push) Has been cancelled
continuous build / cross_build (TARGET=EV4, alpha, alpha-linux-gnu) (push) Has been cancelled
continuous build / cross_build (TARGET=MIPS1004K, mipsel, mipsel-linux-gnu) (push) Has been cancelled
continuous build / cross_build (TARGET=RISCV64_GENERIC, riscv64, riscv64-linux-gnu) (push) Has been cancelled
continuous build / neoverse_build (push) Has been cancelled
harmonyos / build (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=GENERIC, DYNAMIC_ARCH, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA264, LA264, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA464, LA464, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA64_GENERIC, LA64_GENERIC, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSON2K1000, LOONGSON2K1000, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSON3R5, LOONGSON3R5, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSONGENERIC, LOONGSONGENERIC, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=GENERIC, DYNAMIC_ARCH) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA264, LA264) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA464, LA464) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA64_GENERIC, LA64_GENERIC) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSON2K1000, LOONGSON2K1000) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSON3R5, LOONGSON3R5) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSONGENERIC, LOONGSONGENERIC) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=I6400, I6400, mipsisa64r6el-linux-gnuabi64) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=I6500, I6500, mipsisa64r6el-linux-gnuabi64) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=MIPS64_GENERIC, MIPS64_GENERIC, mips64el-linux-gnuabi64) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=P6600, P6600, mipsisa64r6el-linux-gnuabi64) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=SICORTEX, SICORTEX, mips64el-linux-gnuabi64) (push) Has been cancelled
riscv64 zvl256b qemu test / TEST (TARGET=RISCV64_GENERIC BINARY=64 ARCH=riscv64 DYNAMIC_ARCH=1, rv64,g=true,c=true,v=true,vext_spec=v1.0,vlen=256,elen=64, DYNAMIC_ARCH=1) (push) Has been cancelled
riscv64 zvl256b qemu test / TEST (TARGET=RISCV64_ZVL128B BINARY=64 ARCH=riscv64, rv64,g=true,c=true,v=true,vext_spec=v1.0,vlen=128,elen=64, RISCV64_ZVL128B) (push) Has been cancelled
riscv64 zvl256b qemu test / TEST (TARGET=RISCV64_ZVL256B BINARY=64 ARCH=riscv64 BUILD_BFLOAT16=1 BUILD_HFLOAT16=1, rv64,g=true,c=true,v=true,vext_spec=v1.0,vlen=256,elen=64,zfh=true,zvfh=true,zvfbfwma=true, RISCV64_ZVL256B) (push) Has been cancelled
Windows ARM64 CI / build (push) Has been cancelled
Nightly-Homebrew-Build / build-OpenBLAS-with-Homebrew (push) Has been cancelled
Optimize ZROT_RVV for the unit-stride case (inc_x = inc_y = 1)
2025-11-19 07:33:08 +01:00
Martin Kroeker
17f2e94260 Merge pull request #5539 from FRosner/arm64-dot-kernel-refactoring
Refactoring: ARM64 dot Kernel: don't call num_cpu_avail twice
2025-11-18 23:25:51 +01:00
Frank Rosner
762ed66c72 Refactoring: ARM64 dot Kernel: don't call num_cpu_avail twice 2025-11-18 15:34:51 +01:00
daichengrong
98a8230dee Optimize ZROT_RVV for the unit-stride case (inc_x = inc_y = 1) 2025-11-18 17:34:27 +08:00
mayeut
39d5e44723 fix: dot_kernel_sve "n" usage & clobber list 2025-11-17 21:53:51 +01:00
Martin Kroeker
f2d010de12 Merge pull request #5512 from quic/topic/ssyrk_direct_sme1
Some checks failed
apple m / build (cmake, gfortran, 0, 0) (push) Has been cancelled
apple m / build (cmake, gfortran, 0, 1) (push) Has been cancelled
apple m / build (cmake, gfortran, 1, 0) (push) Has been cancelled
apple m / build (cmake, gfortran, 1, 1) (push) Has been cancelled
apple m / build (make, gfortran, 0, 0) (push) Has been cancelled
apple m / build (make, gfortran, 0, 1) (push) Has been cancelled
apple m / build (make, gfortran, 1, 0) (push) Has been cancelled
apple m / build (make, gfortran, 1, 1) (push) Has been cancelled
arm64 graviton cirun / build (cmake, gfortran) (push) Has been cancelled
arm64 graviton cirun / build (make, gfortran) (push) Has been cancelled
c910v qemu test / TEST (riscv64-linux-gnu, NO_SHARED=1 TARGET=C910V, C910V, riscv64-unknown-linux-gnu) (push) Has been cancelled
c910v qemu test / TEST (riscv64-linux-gnu, NO_SHARED=1 TARGET=RISCV64_GENERIC, RISCV64_GENERIC, riscv64-linux-gnu) (push) Has been cancelled
Run codspeed benchmarks / benchmarks (make, gfortran, ubuntu-22.04, 3.12) (push) Has been cancelled
Publish docs via GitHub Pages / Deploy docs (push) Has been cancelled
continuous build / build (cmake, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (cmake, gfortran, macos-latest) (push) Has been cancelled
continuous build / build (cmake, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, flang, ubuntu-latest) (push) Has been cancelled
continuous build / build (make, gfortran, macos-latest) (push) Has been cancelled
continuous build / build (make, gfortran, ubuntu-latest) (push) Has been cancelled
continuous build / msys2 (None, fc, int32, UCRT64, mingw-w64-ucrt-x86_64) (push) Has been cancelled
continuous build / msys2 (Release, fc, int32, CLANG64, mingw-w64-clang-x86_64) (push) Has been cancelled
continuous build / msys2 (Release, fc, int32, MINGW32, mingw-w64-i686) (push) Has been cancelled
continuous build / msys2 (Release, fc, int32, UCRT64, mingw-w64-ucrt-x86_64) (push) Has been cancelled
continuous build / msys2 (Release, fc, int64, -DBINARY=64 -DINTERFACE64=1, CLANG64, mingw-w64-clang-x86_64) (push) Has been cancelled
continuous build / msys2 (Release, fc, int64, -DBINARY=64 -DINTERFACE64=1, UCRT64, mingw-w64-ucrt-x86_64) (push) Has been cancelled
continuous build / cross_build (DYNAMIC_ARCH=1 TARGET=GENERIC, mips64el, mips64el-linux-gnuabi64) (push) Has been cancelled
continuous build / cross_build (TARGET=EV4, alpha, alpha-linux-gnu) (push) Has been cancelled
continuous build / cross_build (TARGET=MIPS1004K, mipsel, mipsel-linux-gnu) (push) Has been cancelled
continuous build / cross_build (TARGET=RISCV64_GENERIC, riscv64, riscv64-linux-gnu) (push) Has been cancelled
continuous build / neoverse_build (push) Has been cancelled
harmonyos / build (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=GENERIC, DYNAMIC_ARCH, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA264, LA264, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA464, LA464, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA64_GENERIC, LA64_GENERIC, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSON2K1000, LOONGSON2K1000, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSON3R5, LOONGSON3R5, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSONGENERIC, LOONGSONGENERIC, loongarch64-linux-gnu) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=GENERIC, DYNAMIC_ARCH) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA264, LA264) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA464, LA464) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LA64_GENERIC, LA64_GENERIC) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSON2K1000, LOONGSON2K1000) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSON3R5, LOONGSON3R5) (push) Has been cancelled
loongarch64 clang qemu test / TEST (NO_SHARED=1 DYNAMIC_ARCH=1 TARGET=LOONGSONGENERIC, LOONGSONGENERIC) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=I6400, I6400, mipsisa64r6el-linux-gnuabi64) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=I6500, I6500, mipsisa64r6el-linux-gnuabi64) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=MIPS64_GENERIC, MIPS64_GENERIC, mips64el-linux-gnuabi64) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=P6600, P6600, mipsisa64r6el-linux-gnuabi64) (push) Has been cancelled
mips64 qemu test / TEST (NO_SHARED=1 TARGET=SICORTEX, SICORTEX, mips64el-linux-gnuabi64) (push) Has been cancelled
riscv64 zvl256b qemu test / TEST (TARGET=RISCV64_GENERIC BINARY=64 ARCH=riscv64 DYNAMIC_ARCH=1, rv64,g=true,c=true,v=true,vext_spec=v1.0,vlen=256,elen=64, DYNAMIC_ARCH=1) (push) Has been cancelled
riscv64 zvl256b qemu test / TEST (TARGET=RISCV64_ZVL128B BINARY=64 ARCH=riscv64, rv64,g=true,c=true,v=true,vext_spec=v1.0,vlen=128,elen=64, RISCV64_ZVL128B) (push) Has been cancelled
riscv64 zvl256b qemu test / TEST (TARGET=RISCV64_ZVL256B BINARY=64 ARCH=riscv64 BUILD_BFLOAT16=1 BUILD_HFLOAT16=1, rv64,g=true,c=true,v=true,vext_spec=v1.0,vlen=256,elen=64,zfh=true,zvfh=true,zvfbfwma=true, RISCV64_ZVL256B) (push) Has been cancelled
Windows ARM64 CI / build (push) Has been cancelled
Nightly-Homebrew-Build / build-OpenBLAS-with-Homebrew (push) Has been cancelled
Support for SME1 based ssyrk_direct kernel for cblas_ssyrk level 3 API
2025-11-06 14:06:43 -08:00
Chip Kerchner
00a7336fc9 Missing one gemv conversion. 2025-11-04 22:27:53 +00:00
Chip Kerchner
edf2e5900c Prevent possible conversion from bfloat16 to __bf16. 2025-11-04 21:00:37 +00:00
Martin Kroeker
0c59ae0b45 Merge pull request #5453 from pratiklp00/dgemm_optimization
Dgemm loop unroll and 4x1, 4x2 dgemv VSX implementation for power10.
2025-10-28 16:51:41 -07:00
Martin Kroeker
585e6d0680 Merge pull request #5515 from iha-taisei/feature/ger_unroll
Improve single-thread performance of [SD]GER on A64FX and Neoverse V1
2025-10-24 08:17:06 -07:00
Iha, Taisei
cb66aca707 Improve single-thread performance of [SD]GER on A64FX and Neoverse V1 2025-10-22 19:56:14 +09:00
Yichao Yu
3d19d3b60a Make dummy function have the same linkage as the real one 2025-10-20 12:42:39 -04:00
changjua
43d38d336f Support for SME1 based ssyrk_direct kernel for cblas_ssyrk level 3 API 2025-10-20 11:35:20 +08:00
pratiklp00
6637352260 remmove spacing 2025-10-14 00:06:04 -05:00
Yichao Yu
b94e9b92ad Fix compilation on ARM
Define a dummy function if SME is not supported, following what sgemm does
2025-10-11 20:28:59 -04:00
Martin Kroeker
e40714cabd Merge pull request #5450 from quic/topic/strmm_direct_sme1
Support for SME1 based strmm_direct kernel for cblas_strmm level 3 API
2025-10-11 15:20:19 -07:00
changjua
644ea07ef9 Support for SME1 based strmm_direct kernel for cblas_strmm level 3 API 2025-10-10 10:48:27 +08:00
pratiklp00
e2399be6d2 add macro 2025-10-08 23:24:41 -05:00
Chip Kerchner
03a83778bb Tie in SHGEMV for RISC-V. 2025-10-08 14:08:29 +00:00
Martin Kroeker
49eca84eaf Merge pull request #5478 from martin-frbg/issue5477
Change all aligned moves in x86_64 MIN/MAX to unaligned
2025-10-08 02:46:00 -07:00
Martin Kroeker
46fc6c0794 fix unspecified array size in clobber list 2025-10-08 08:23:24 +02:00
Martin Kroeker
064751ee65 Merge pull request #5481 from ChipKerchner/vectorSBGEMV
Add SBGEMV and SHGEMV routines to RISC-V
2025-10-07 13:31:03 -07:00
Chip Kerchner
f552040c5d Fix stride issue. 2025-10-07 17:17:18 +00:00
Chris Sidebottom
37fc3bbca0 Add Infrastructure for SHGEMV
This adds all the relevant bits and pieces to add a `shgemv` path as
well as a future `hgemm`/`hgemv` path in a similar model to `sb` and `b`
interfaces.

I've also fixed a few bits and pieces around `shgemm` which didn't build
in a few situations.
2025-10-07 15:03:24 +00:00
Chip Kerchner
aecb7f9537 Change signature of SBGEMV. 2025-10-07 13:14:20 +00:00
Chris Sidebottom
958f721e36 Beta fix for generic gemv T 2025-10-07 10:01:12 +00:00
Chris Sidebottom
578e7dae85 Fix bf16->f32 conversion for NEOVERSEV1 and NEOVERSEN2 targets
This fixes an issue originally introduced with the BGEMM kernel.

I've updated the tests to run with `beta=1.0` so as to test loading and
updating from C.

Alongside this, the tests now return sensible return values to reduce
the risk of them being ignored.

Also fixed a bug in `generic/gemv_t.c` resulting in weird outputs for
`bgemv`.
2025-10-06 18:05:58 +00:00
Chip Kerchner
809e1cba8f Better FP16 vectorized GEMV - 20% faster. 2025-10-06 13:19:03 +00:00
Chip Kerchner
e07a9ae418 Merge branch 'develop' into vectorSBGEMV 2025-10-03 17:13:29 +00:00
Chip Kerchner
588f0e87cc Add SBGEMV and SHGEMV routines to RISC-V. 2025-10-03 17:09:16 +00:00
Martin Kroeker
b48a089d75 Change all aligned moves to unaligned 2025-10-01 23:36:48 +02:00
Martin Kroeker
e939c6c315 Merge pull request #5471 from quic/topic/ssymm_direct_sme1
Support for SME1 based ssymm_direct kernel for cblas_ssymm level 3 API
2025-10-01 06:22:36 -07:00
Chip Kerchner
36f9cb85b1 Fix pre-RVV 1.0. 2025-09-30 22:41:31 +00:00
Chip Kerchner
2d82d144e2 Tranverse matrix data in a cache friendly manner for GEMV_N (RISCV). 2025-09-30 21:22:10 +00:00
Martin Kroeker
aaa5c377bc Merge pull request #5465 from ChipKerchner/addRVVVectorizedFP16Packing
Add vectorized packing for FP16 and BF16 for RISC-V.  Reactivate vector packing for FP64 transposed
2025-09-30 09:21:15 -07:00
Rajendra Prasad Matcha
19268471cc Support for SME1 based ssymm_direct kernel for cblas_ssymm level 3 API 2025-09-30 15:05:33 +05:30
Chip Kerchner
67ddda394e Merge branch 'develop' into addRVVVectorizedFP16Packing 2025-09-29 13:49:57 +00:00