Chris Sidebottom
f95e7b0e32
Add infrastructure for BGEMM
...
Set up the infrastructure for BGEMM support in OpenBLAS; hopefully I found all the right places.
Derived mostly from the previous work done in https://github.com/OpenMathLib/OpenBLAS/pull/5287
Co-authored-by: Ye Tao <ye.tao@arm.com>
2025-07-08 16:22:41 +01:00
Srangrang
ec14e1648c
fix: resolve build failure on non-RISCV hosts
...
- adjust interface to disable "small matrix" pathway
- separate HFLOAT16 from BFLOAT16
- remove SHGEMM_UNROLL_M and SHGEMM_UNROLL_N equal conditions
Related to PR #5290
Co-authored-by: Martin
2025-06-15 20:25:15 +08:00
gkdddd
670ec6f757
Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 for RISCV64_ZVL256B
...
Added HFLOAT16 support for RISCV64
Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 for RISCV64_ZVL256B based on HFLOAT16
The instruction sets used are ZVFH and ZFH, which need to be supported by RVV1.0
Related to issue #5279
Co-authored-by: Linjin Li <linjin_li@163.com>
2025-06-03 20:14:30 +08:00
Martin Kroeker
5141a90993
Fix ARMV9SME target in DYNAMIC_ARCH and add SME query code for MacOS (#5222)
...
* Fix ARMV9SME target and add support_sme1 code for MacOS
* make sgemm_direct unconditionally available on all arm64
* build a (dummy) sgemm_direct kernel on all arm64
* Update dynamic_arm64.c
2025-05-10 22:39:32 +02:00
Vaisakh K V
d23eb3b93e
Support for SME1 based sgemm_direct kernel for cblas_sgemm level 3 API
...
* Added ARMV9SME target
* Added SGEMM_DIRECT kernel based on SME1
2025-02-13 14:51:21 +05:30
Martin Kroeker
46e331a917
remove the unworkable GEMM3M restriction from GENERIC again
2024-08-07 19:41:10 +02:00
Martin Kroeker
2787c9f8e4
Disable GEMM3M for generic targets (not implemented)
2024-06-06 14:39:50 +02:00
Martin Kroeker
04bc801999
(Re)apply fixes for supporting only a subset of precision types from PR 3915
2023-11-04 23:48:59 +01:00
Rajalakshmi Srinivasaraghavan
9f42570e33
POWER: Increase macro size limit for AIX
...
This patch increases the macro size limit from 4096 to 16384 to
allow compiling larger assembly files in AIX.
Tested with GCC and IBM Open XL C.
2023-10-12 12:37:40 -05:00
Rajalakshmi Srinivasaraghavan
71d733e5f7
POWER: Avoid m4 conversions for C files
...
This patch removes the intermediate m4 conversions used in sbgemm
compilation, as they are not needed for .c files.
Tested on AIX with gcc and IBM Open XL C.
2023-10-11 17:18:42 -05:00
Martin Kroeker
61d803547a
Apply USE_TRMM to MIPS64_GENERIC as to GENERIC
2023-08-06 15:17:38 +02:00
Martin Kroeker
898cf5faf3
Add Elbrus e2k architecture support
2022-01-22 18:55:10 +01:00
Bine Brank
b6a445cfd8
adapt Makefile for SVE trsm
2022-01-16 21:40:56 +01:00
Bine Brank
bb33446b40
fix makefile.L3
2022-01-06 10:26:11 +01:00
Bine Brank
07fa6fa3b1
configure Makefile for sve
2022-01-05 08:57:51 +01:00
Bine Brank
0140373802
add sve ztrmm
2022-01-02 19:15:33 +01:00
Bine Brank
774267fdac
adjust Makefile.L3 for SVE
2021-12-11 16:35:08 +01:00
Bine Brank
86ae89bf33
add sgemm kernel and copy functions for sgemm and ssymm
2021-11-28 18:12:47 +01:00
Bine Brank
9b9cb90bb1
modify Makefile for SVE copy
2021-11-22 09:54:20 +01:00
Bine Brank
9388f05a3c
configure SVE Makefile
2021-11-21 18:33:43 +01:00
Wangyang Guo
3dc6052c7e
initial support for Sapphire Rapids platform
2021-10-12 01:30:40 -07:00
Martin Kroeker
f1e3305974
Add workaround for Windows10 macro name clash
2021-09-01 21:36:50 +02:00
Wangyang Guo
619588fbab
sbgemm: remove unnecessary b0 files
2021-08-30 17:55:01 +08:00
Wangyang Guo
1d83ca4bca
Small Matrix: support BFLOAT16 data type
2021-08-30 17:40:20 +08:00
Wangyang Guo
989e6bbdd3
Small Matrix: reduce generic kernel source files
2021-08-13 03:17:38 +00:00
Wangyang Guo
5dc7c3c8e5
Small Matrix: add GEMM_SMALL_MATRIX_PERMIT to tune small matrix cases
2021-08-02 07:06:54 +00:00
Xianyi Zhang
57ed58cefe
Refs #2587 Add small matrix optimization reference kernel for c/zgemm.
2021-08-02 07:06:54 +00:00
Xianyi Zhang
17d32a4a82
Change a1b0 gemm to b0 gemm.
2021-08-02 07:06:54 +00:00
Xianyi Zhang
59cb5de46b
Refs #2587 Fix typos.
2021-08-02 07:06:54 +00:00
Xianyi Zhang
be3349405d
Add alpha=1.0 beta=0.0 for small gemm.
2021-08-02 07:01:47 +00:00
Xianyi Zhang
0a2077901c
Add small matrix optimization kernel interface.
...
make SMALL_MATRIX_OPT=1
2021-08-02 07:01:47 +00:00
Martin Kroeker
c4da892ba0
Only filter out -mavx on Sandybridge ZGEMM/ZTRMM kernels
2021-05-14 23:19:10 +02:00
Martin Kroeker
bd60fb6ffc
filter out -mavx flag on zgemm kernels as it can cause problems with older gcc
2021-05-13 23:05:00 +02:00
gxw
4b548857d6
Add msa support for loongson
...
1. Using core loongson3r3 and loongson3r4 for loongson
2. Add DYNAMIC_ARCH for loongson
Change-Id: I1c6b54dbeca3a0cc31d1222af36a7e9bd6ab54c1
2020-12-09 10:28:46 +08:00
Zhang Xianyi
d7ba7679b6
Merge branch 'develop' into risc-v
2020-10-16 23:27:38 +08:00
Rajalakshmi Srinivasaraghavan
b5d30b390d
Fix build issues with bfloat16
...
This patch fixes compilation errors due to recent renaming from SH to SB
with BUILD_BFLOAT16.
2020-10-13 11:00:22 -05:00
Martin Kroeker
3aecafad80
Change "HALF" and "sh" to "BFLOAT16" and "sb"
2020-10-12 00:00:55 +02:00
Martin Kroeker
6b6adf8a4a
Allow compiling only a subset of kernels for specific variable types
2020-10-11 14:52:09 +02:00
Martin Kroeker
9ee21a0a39
Merge pull request #2780 from Guobing-Chen/CPL_build_support
...
Enable COOPERLAKE build target
2020-08-20 19:54:29 +02:00
Martin Kroeker
75eeb265d7
[WIP] Refactor the driver code for direct SGEMM (#2782)
...
Move "direct SGEMM" functionality out of the SkylakeX SGEMM kernel and make it available
(on x86_64 targets only for now) in DYNAMIC_ARCH builds
* Add sgemm_direct targets in the kernel Makefile.L3 and CMakeLists.txt
* Add direct_sgemm functions to the gotoblas struct in common_param.h
* Move sgemm_direct_performant helper to separate file
* Update gemm.c to use macros for sgemm_direct to support dynamic_arch naming via common_s.h
* (Conditionally) add sgemm_direct functions in setparam-ref.c
2020-08-19 14:51:09 +02:00
Chen, Guobing
e740c4873d
Enable COOPERLAKE build target
...
Enable new build target platform -- COOPERLAKE. This target platform
supports all the SKYLAKEX supported ISAs + avx512bf16. So all the
SKYLAKEX specific kernels/drivers and related code are now extended
to be also active on COOPERLAKE. Besides, new BF16 related kernels
are active under this target.
2020-08-13 06:18:00 +08:00
Rajalakshmi Srinivasaraghavan
475b5c95b9
Remove extra symbol in Makefile
...
While trying out different unroll values, noted that
make failed due to this extra symbol.
2020-08-07 15:27:44 -05:00
Martin Kroeker
da17abec87
fix trailing whitespace
2020-07-14 18:20:03 +02:00
Martin Kroeker
b144423f0f
Do not define USE_TRMM for 32bit POWER8
2020-07-14 18:10:12 +02:00
Martin Kroeker
ed7e155c35
Merge branch 'develop' into aix
2020-07-07 18:52:06 +02:00
Martin Kroeker
c854ef5471
Fix variable names in conditional
2020-06-25 13:29:52 +02:00
Martin Kroeker
c0afc11742
Fix POWERPC builds on AIX (gcc/gfortran 7)
...
1. macro preprocessing for POWER8 and later kernels only
2. default buffer size used by AIX version of m4 is too small
2020-06-25 13:12:36 +02:00
Kavana Bhat
df4ade070f
Fix for #2671
2020-06-24 04:25:47 -05:00
Rajalakshmi Srinivasaraghavan
9fe930f205
powerpc: Add support for future processor
...
This is the initial patch to support build infrastructure
for POWER10 architecture.
2020-06-11 15:47:20 -05:00
Martin Kroeker
5dd14e3d48
Make building the bfloat16 functions conditional on option BUILD_HALF (#2590)
...
* make building the bfloat16 BLAS functions conditional on BUILD_HALF
* pass the BUILD_HALF option to gensymbol
* Pass BUILD_HALF as a compiler define for dynamic_arch builds
2020-05-01 09:58:30 +02:00