Martin Kroeker
2332ea7e7a
fix misleading indentation
2024-11-06 18:35:31 +01:00
Martin Kroeker
73e13b0273
flesh out HERK prototype
2024-08-12 14:45:40 +02:00
Martin Kroeker
824306baab
flesh out HERK prototype
2024-08-12 14:44:13 +02:00
Mark Ryan
3b715e6162
Add autodetection for riscv64
Implement DYNAMIC_ARCH support for riscv64. Three cpu types are
supported, riscv64_generic, riscv64_zvl256b, riscv64_zvl128b.
The two non-generic kernels require CPU support for RVV 1.0 to
function correctly. Detecting that a riscv64 device supports
RVV 1.0 is a little complicated as there are some boards on the
market that advertise support for V via hwcap but only support
RVV 0.7.1, which is not binary compatible with RVV 1.0. The
approach taken is to first try hwprobe. If hwprobe is not
available, we fall back to hwcap + an additional check to distinguish
between RVV 1.0 and RVV 0.7.1.
Tested on a VM with VLEN=256, a CanMV K230 with VLEN=128 (with only
the big core enabled), a Lichee Pi with RVV 0.7.1 and a VF2 with no
vector.
A compiler with RVV 1.0 support must be used to build OpenBLAS for
riscv64 when DYNAMIC_ARCH=1.
Signed-off-by: Mark Ryan <markdryan@rivosinc.com>
2024-07-15 14:24:22 +00:00
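The detection order described in the commit above (hwprobe first, then hwcap plus an extra RVV 1.0 check) can be sketched as pure decision logic. This is only an illustration under stated assumptions, not the code from the commit: every identifier below is hypothetical, the probe results are passed in as plain values rather than read from the kernel, and the mapping from VLEN to kernel (256 bits or more selects riscv64_zvl256b, 128 bits or more selects riscv64_zvl128b) is assumed from the names of the three CPU types.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout: none of these identifiers come from OpenBLAS. */
typedef enum { CORE_GENERIC, CORE_ZVL128B, CORE_ZVL256B } core_t;

/* Outcome of querying hwprobe about the vector extension. */
typedef enum {
    HWPROBE_UNAVAILABLE, /* no hwprobe on this kernel: fall back to hwcap */
    HWPROBE_NO_RVV10,    /* hwprobe answered: RVV 1.0 is not present */
    HWPROBE_RVV10        /* hwprobe answered: RVV 1.0 is present */
} hwprobe_t;

/* hwprobe is authoritative when available. Otherwise the V bit in hwcap
 * is trusted only if the additional check confirms RVV 1.0, because some
 * boards advertise V while implementing RVV 0.7.1. */
static bool has_rvv10(hwprobe_t probe, bool hwcap_v, bool extra_check_rvv10)
{
    if (probe != HWPROBE_UNAVAILABLE)
        return probe == HWPROBE_RVV10;
    return hwcap_v && extra_check_rvv10;
}

/* Pick a kernel set. The VLEN thresholds are an assumption of this sketch. */
static core_t select_core(hwprobe_t probe, bool hwcap_v,
                          bool extra_check_rvv10, unsigned vlen_bits)
{
    if (!has_rvv10(probe, hwcap_v, extra_check_rvv10))
        return CORE_GENERIC;
    if (vlen_bits >= 256)
        return CORE_ZVL256B;
    if (vlen_bits >= 128)
        return CORE_ZVL128B;
    return CORE_GENERIC;
}

/* Map the selection to the CPU type names listed in the commit message. */
static const char *core_name(core_t core)
{
    static const char *const names[] = {
        "riscv64_generic", "riscv64_zvl128b", "riscv64_zvl256b"
    };
    assert((unsigned)core <= (unsigned)CORE_ZVL256B);
    return names[core];
}
```

In this sketch, a board that sets V in hwcap but fails the extra check (the RVV 0.7.1 case) ends up on the generic kernels, as does a board with no vector unit at all.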
Martin Kroeker
2dda40d280
use atomic operations as in the corresponding getrf
2024-03-28 11:33:31 +01:00
Dirreke
ec89466e14
Add CSKY support
2024-01-16 23:45:06 +08:00
Martin Kroeker
1d4aa8d7d5
fix improper function prototypes (empty parentheses)
2023-09-30 13:00:51 +02:00
Martin Kroeker
f4f31fb53b
fix improper function prototypes (empty parentheses)
2023-09-30 12:59:44 +02:00
gxw
d15e0a055c
LoongArch64: Fixed compilation issues when DYNAMIC_ARCH is enabled
2023-09-27 10:05:27 +08:00
Martin Kroeker
3b6050ac04
clarify the comment on the out-of-bounds check from #723
2023-08-26 02:00:00 +02:00
Martin Kroeker
22a402bc2c
clarify the comment on the out-of-bounds check from #723
2023-08-26 01:58:08 +02:00
Martin Kroeker
437c0bf2b4
Merge pull request #3843 from Mousius/switch-ratio
Propagate SWITCH_RATIO to DYNAMIC_ARCH builds
2023-04-19 11:51:54 +02:00
Chris Sidebottom
32f2fafde7
Propagate SWITCH_RATIO to DYNAMIC_ARCH builds
Previously dynamic builds were either using the default SWITCH_RATIO
or one from the higher level architecture; this patch ensures the
dynamic builds can use this parameter as well.
2023-04-17 15:34:12 +01:00
Martin Kroeker
6c431239da
Split test condition in LU computation - non-denormal for computation, exact zero for reporting singularity
2023-03-29 22:14:21 +02:00
Martin Kroeker
12aabb9f9b
fix conditional
2023-03-29 09:44:33 +02:00
Martin Kroeker
f3d21039ce
Improve fix from PR3924 (#3941)
* compare denominator against DBL_MIN rather than a somewhat arbitrary small number near it
2023-03-16 15:09:32 +01:00
Martin Kroeker
3d27cbd9a3
avoid overflow in division
2023-02-26 23:44:14 +01:00
Martin Kroeker
a39ced0551
avoid overflow in division
2023-02-26 23:42:20 +01:00
Martin Kroeker
aa2a2d9c01
Conditionally compile files that may get replaced by ReLAPACK
2022-11-08 12:04:46 +01:00
Martin Kroeker
7656aba00e
Merge pull request #3493 from martin-frbg/casts+cleanup
WIP casts and cleanups
2022-02-06 23:55:06 +01:00
Martin Kroeker
40003f8edb
Fix pivot offset calculation for negative incx
2022-01-17 00:11:18 +01:00
Martin Kroeker
57e2a72f40
Fix pivot offset calculation for negative incx
2022-01-17 00:10:21 +01:00
Martin Kroeker
3b6293f5a0
Fix offset calculation for negative incx
2022-01-17 00:09:14 +01:00
Martin Kroeker
afa0cece5c
Fix pivot offset calculation for negative incx
2022-01-17 00:08:20 +01:00
Martin Kroeker
eca2f50b48
Fix pivot offset calculation for negative incx
2022-01-17 00:07:33 +01:00
Martin Kroeker
0e9e951306
Fix pivot offset calculation for negative incx
2022-01-17 00:06:41 +01:00
Martin Kroeker
1b49ef8dcf
Fix pivot index for negative increments
2022-01-17 00:05:33 +01:00
Martin Kroeker
6b407a16cb
fix function typecasts
2021-12-21 18:51:28 +01:00
Martin Kroeker
aecb4a5e8d
fix function typecasts
2021-12-21 18:50:22 +01:00
Martin Kroeker
c49d46f25f
fix function typecast
2021-12-21 18:49:18 +01:00
gxw
af0a69f355
Add support for LOONGARCH64
2021-07-27 15:29:12 +08:00
Zhang Xianyi
d7ba7679b6
Merge branch 'develop' into risc-v
2020-10-16 23:27:38 +08:00
Martin Kroeker
4bb73c0171
Rename "HALF" type to "BFLOAT16"
2020-10-13 20:07:19 +02:00
Martin Kroeker
32733ded04
Rename "HALF" and "sh" to "BFLOAT16" and "sb"
2020-10-11 23:52:45 +02:00
Martin Kroeker
b27ca78a21
Adapt to having only a subset of variable types supported
2020-10-11 14:46:24 +02:00
Martin Kroeker
93454022a9
Adapt to having only a subset of variable types supported
2020-10-11 14:45:40 +02:00
Martin Kroeker
20cf1d773f
Adapt to having only a subset of variable types supported
2020-10-11 14:44:56 +02:00
Martin Kroeker
5c657fffad
Adapt to having only a subset of variable types supported
2020-10-11 14:44:13 +02:00
Martin Kroeker
b262058059
Adapt to having only a subset of variable types supported
2020-10-11 14:43:13 +02:00
Martin Kroeker
bc319cee82
Adapt to having only a subset of variable types supported
2020-10-11 14:42:26 +02:00
Martin Kroeker
e5966f8606
Adapt to having only a subset of variable types supported
2020-10-11 14:41:43 +02:00
Martin Kroeker
9df12eb08f
Adapt to having only a subset of variable types supported
2020-10-11 14:40:51 +02:00
Martin Kroeker
cf53970bcb
Adapt to having only a subset of variable types supported
2020-10-11 14:40:06 +02:00
Martin Kroeker
dcd51d5c72
Adapt to having only a subset of variable types supported
2020-10-11 14:39:19 +02:00
Martin Kroeker
b8f95354c7
Adapt to having only a subset of variable types supported
2020-10-11 14:38:25 +02:00
Martin Kroeker
f194ad59e1
Use _Atomic instead of volatile where available (file moved from ../getrf)
Must have misplaced this in ../getrf when I made that change in March 2018 (40160ff).
The only changes since then were:
* "RFC : Add half precision gemm for bfloat16 in OpenBLAS" (Rajalakshmi Srinivasaraghavan, committed on 14 Apr 2020 as 7ebbb50)
* "Change _STDC_VERSION__ to __STDC_VERSION__" (Zhiyong Dang, committed on 11 May 2018 as 3716267)
2020-07-25 08:52:24 +02:00
Martin Kroeker
4fda217f99
Delete potrf_parallel.c (moving it to ../potrf)
2020-07-25 06:42:39 +00:00
Martin Kroeker
bbe119ee3b
Update conditional for atomics to use HAVE_C11
2020-07-18 17:19:59 +00:00
Martin Kroeker
f4f74941bd
Update conditional for atomics to use HAVE_C11
2020-07-18 17:14:50 +00:00
Rajalakshmi Srinivasaraghavan
22bb50fb81
cmake fixes
2020-04-17 13:35:17 -05:00