Martin Kroeker
04bb5acd79
change BLAS_HALF to BLAS_BFLOAT16 (another missed rename)
2025-07-08 14:40:22 +02:00
Chris Sidebottom
7a97c4ca97
Rename HALF -> BFLOAT16 in some more places
2025-07-07 10:13:39 +00:00
Usui, Tetsuzo
14107e37d9
Add parallel laed3
2025-07-01 22:12:27 +09:00
gkdddd
670ec6f757
Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 for RISCV64_ZVL256B
Added HFLOAT16 support for RISCV64
Added shgemm_kernel_8x8 for RISCV64_ZVL128B and shgemm_kernel_16x8 for RISCV64_ZVL256B based on HFLOAT16
The instruction sets used are ZVFH and ZFH, which need to be supported by RVV1.0
Related to issue #5279
Co-authored-by Linjin Li <linjin_li@163.com>
2025-06-03 20:14:30 +08:00
Ruiyang Wu
02fd1df10b
CMake: Pass OpenMP compiler and linker flags through CMake targets
Using `OpenMP::OpenMP_LANG` targets for CMake is less error-prone than
passing the compiler and linker flags manually. Furthermore, it allows
the user to customize those flags by setting `OpenMP_LANG_FLAGS`,
`OpenMP_LANG_LIB_NAMES`, and `OpenMP_omp_LIBRARY`.
2025-03-26 23:09:54 -04:00
Martin Kroeker
2332ea7e7a
fix misleading indentation
2024-11-06 18:35:31 +01:00
Martin Kroeker
73e13b0273
flesh out HERK prototype
2024-08-12 14:45:40 +02:00
Martin Kroeker
824306baab
flesh out HERK prototype
2024-08-12 14:44:13 +02:00
Mark Ryan
3b715e6162
Add autodetection for riscv64
Implement DYNAMIC_ARCH support for riscv64. Three cpu types are
supported, riscv64_generic, riscv64_zvl256b, riscv64_zvl128b.
The two non-generic kernels require CPU support for RVV 1.0 to
function correctly. Detecting that a riscv64 device supports
RVV 1.0 is a little complicated as there are some boards on the
market that advertise support for V via hwcap but only support
RVV 0.7.1, which is not binary compatible with RVV 1.0. The
approach taken is to first try hwprobe. If hwprobe is not
available, we fall back to hwcap + an additional check to distinguish
between RVV 1.0 and RVV 0.7.1.
Tested on a VM with VLEN=256, a CanMV K230 with VLEN=128 (with only
the big core enabled), a Lichee Pi with RVV 0.7.1 and a VF2 with no
vector.
A compiler with RVV 1.0 support must be used to build OpenBLAS for
riscv64 when DYNAMIC_ARCH=1.
Signed-off-by: Mark Ryan <markdryan@rivosinc.com>
2024-07-15 14:24:22 +00:00
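The detection order described in the commit above (hwprobe first, then hwcap plus an extra check) can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `riscv64_probe` struct, its field names, `has_rvv10`, `select_core`, and the VLEN thresholds used to pick a kernel are all hypothetical, and the probe results are passed in as plain values instead of being read from the kernel so that the decision logic can run on any machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The three core types named in the commit message. */
typedef enum { CORE_GENERIC, CORE_ZVL128B, CORE_ZVL256B } riscv64_core;

/* Hypothetical, pre-collected probe results. Real code would fill these
 * in from hwprobe, hwcap, and a runtime check; none of that is shown. */
typedef struct {
    bool hwprobe_available;   /* is hwprobe usable on this system?        */
    bool hwprobe_has_rvv10;   /* hwprobe reports RVV 1.0 (if available)   */
    bool hwcap_has_v;         /* hwcap advertises the V extension         */
    bool extra_check_rvv10;   /* additional check says 1.0, not 0.7.1     */
    unsigned vlen_bits;       /* vector register length in bits           */
} riscv64_probe;

/* Prefer hwprobe; otherwise trust hwcap only if the extra check passes,
 * because some boards advertise V via hwcap but implement RVV 0.7.1. */
bool has_rvv10(const riscv64_probe *p)
{
    assert(p != NULL);
    if (p->hwprobe_available)
        return p->hwprobe_has_rvv10;
    return p->hwcap_has_v && p->extra_check_rvv10;
}

/* Assumption: the widest kernel whose VLEN requirement is met is chosen. */
riscv64_core select_core(const riscv64_probe *p)
{
    if (!has_rvv10(p))
        return CORE_GENERIC;
    if (p->vlen_bits >= 256)
        return CORE_ZVL256B;
    if (p->vlen_bits >= 128)
        return CORE_ZVL128B;
    return CORE_GENERIC;
}
```

With these assumptions, a board that advertises V through hwcap but fails the extra check falls back to the generic kernels, which is the RVV 0.7.1 case the commit message warns about.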
Martin Kroeker
2dda40d280
use atomic operations as in the corresponding getrf
2024-03-28 11:33:31 +01:00
Dirreke
ec89466e14
Add CSKY support
2024-01-16 23:45:06 +08:00
Martin Kroeker
1d4aa8d7d5
fix improper function prototypes (empty parentheses)
2023-09-30 13:00:51 +02:00
Martin Kroeker
f4f31fb53b
fix improper function prototypes (empty parentheses)
2023-09-30 12:59:44 +02:00
gxw
d15e0a055c
LoongArch64: Fixed compilation issues when enabling DYNAMIC_ARCH
2023-09-27 10:05:27 +08:00
Martin Kroeker
3b6050ac04
clarify the comment on the out-of-bounds check from #723
2023-08-26 02:00:00 +02:00
Martin Kroeker
22a402bc2c
clarify the comment on the out-of-bounds check from #723
2023-08-26 01:58:08 +02:00
Martin Kroeker
437c0bf2b4
Merge pull request #3843 from Mousius/switch-ratio
Propagate SWITCH_RATIO to DYNAMIC_ARCH builds
2023-04-19 11:51:54 +02:00
Chris Sidebottom
32f2fafde7
Propagate SWITCH_RATIO to DYNAMIC_ARCH builds
Previously dynamic builds were either using the default SWITCH_RATIO
or one from the higher level architecture; this patch ensures the
dynamic builds can use this parameter as well.
2023-04-17 15:34:12 +01:00
Martin Kroeker
6c431239da
Split test condition in LU computation - non-denormal for computation, exact zero for reporting singularity
2023-03-29 22:14:21 +02:00
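The split condition named in the commit above can be illustrated with a small sketch of one pivot step in an unblocked LU factorization. This is not the code from the commit: the function name `scale_below_pivot` and its signature are hypothetical, and the handling of a subnormal pivot (leaving the column unscaled) is an assumption, since the commit message only says that computation requires a non-denormal pivot while singularity is reported for an exact zero.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical helper: handle the pivot of column j (0-based) stored in
 * col[0..n-1]. Returns the updated LAPACK-style info value: 0 while no
 * exact-zero pivot has been seen, otherwise the 1-based index of the
 * first zero pivot. */
int scale_below_pivot(double *col, size_t n, size_t j, int info)
{
    assert(col != NULL && j < n);

    double pivot = col[j];
    double magnitude = pivot < 0.0 ? -pivot : pivot;

    if (pivot != 0.0) {
        /* Computation test: only scale by the reciprocal when the pivot
         * is a normal number, so 1.0 / pivot cannot overflow. */
        if (magnitude >= DBL_MIN) {
            double inv = 1.0 / pivot;
            for (size_t i = j + 1; i < n; i++)
                col[i] *= inv;
        }
        /* Assumption: a subnormal pivot skips the scaling but is not
         * reported as singular. */
    } else if (info == 0) {
        /* Reporting test: only an exact zero marks the matrix singular,
         * and only the first occurrence is recorded. */
        info = (int)j + 1;
    }
    return info;
}
```

Keeping the two tests separate means a tiny but nonzero pivot no longer produces a singularity report, while the reciprocal is still guarded against overflow.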
Martin Kroeker
12aabb9f9b
fix conditional
2023-03-29 09:44:33 +02:00
Martin Kroeker
f3d21039ce
Improve fix from PR3924 (#3941)
* compare denominator against DBL_MIN rather than a somewhat arbitrary small number near it
2023-03-16 15:09:32 +01:00
Martin Kroeker
3d27cbd9a3
avoid overflow in division
2023-02-26 23:44:14 +01:00
Martin Kroeker
a39ced0551
avoid overflow in division
2023-02-26 23:42:20 +01:00
Martin Kroeker
aa2a2d9c01
Conditionally compile files that may get replaced by ReLAPACK
2022-11-08 12:04:46 +01:00
Martin Kroeker
7656aba00e
Merge pull request #3493 from martin-frbg/casts+cleanup
WIP casts and cleanups
2022-02-06 23:55:06 +01:00
Martin Kroeker
40003f8edb
Fix pivot offset calculation for negative incx
2022-01-17 00:11:18 +01:00
Martin Kroeker
57e2a72f40
Fix pivot offset calculation for negative incx
2022-01-17 00:10:21 +01:00
Martin Kroeker
3b6293f5a0
Fix offset calculation for negative incx
2022-01-17 00:09:14 +01:00
Martin Kroeker
afa0cece5c
Fix pivot offset calculation for negative incx
2022-01-17 00:08:20 +01:00
Martin Kroeker
eca2f50b48
Fix pivot offset calculation for negative incx
2022-01-17 00:07:33 +01:00
Martin Kroeker
0e9e951306
Fix pivot offset calculation for negative incx
2022-01-17 00:06:41 +01:00
Martin Kroeker
1b49ef8dcf
Fix pivot index for negative increments
2022-01-17 00:05:33 +01:00
Martin Kroeker
6b407a16cb
fix function typecasts
2021-12-21 18:51:28 +01:00
Martin Kroeker
aecb4a5e8d
fix function typecasts
2021-12-21 18:50:22 +01:00
Martin Kroeker
c49d46f25f
fix function typecast
2021-12-21 18:49:18 +01:00
gxw
af0a69f355
Add support for LOONGARCH64
2021-07-27 15:29:12 +08:00
Zhang Xianyi
d7ba7679b6
Merge branch 'develop' into risc-v
2020-10-16 23:27:38 +08:00
Martin Kroeker
4bb73c0171
Rename "HALF" type to "BFLOAT16"
2020-10-13 20:07:19 +02:00
Martin Kroeker
32733ded04
Rename "HALF" and "sh" to "BFLOAT16" and "sb"
2020-10-11 23:52:45 +02:00
Martin Kroeker
b27ca78a21
Adapt to having only a subset of variable types supported
2020-10-11 14:46:24 +02:00
Martin Kroeker
93454022a9
Adapt to having only a subset of variable types supported
2020-10-11 14:45:40 +02:00
Martin Kroeker
20cf1d773f
Adapt to having only a subset of variable types supported
2020-10-11 14:44:56 +02:00
Martin Kroeker
5c657fffad
Adapt to having only a subset of variable types supported
2020-10-11 14:44:13 +02:00
Martin Kroeker
b262058059
Adapt to having only a subset of variable types supported
2020-10-11 14:43:13 +02:00
Martin Kroeker
bc319cee82
Adapt to having only a subset of variable types supported
2020-10-11 14:42:26 +02:00
Martin Kroeker
e5966f8606
Adapt to having only a subset of variable types supported
2020-10-11 14:41:43 +02:00
Martin Kroeker
9df12eb08f
Adapt to having only a subset of variable types supported
2020-10-11 14:40:51 +02:00
Martin Kroeker
cf53970bcb
Adapt to having only a subset of variable types supported
2020-10-11 14:40:06 +02:00
Martin Kroeker
dcd51d5c72
Adapt to having only a subset of variable types supported
2020-10-11 14:39:19 +02:00
Martin Kroeker
b8f95354c7
Adapt to having only a subset of variable types supported
2020-10-11 14:38:25 +02:00