Commit Graph

8919 Commits

Author SHA1 Message Date
Martin Kroeker
fff2e214ca Add LAPACK-TEST errors topic 2024-12-30 23:05:17 +01:00
Martin Kroeker
718fb73bd8 Merge pull request #4976 from martin-frbg/m3m_exprec
[WIP]Add better workaround for GEMM3M on GENERIC and re-enable EXPRECISION for x86/x86_64 targets
2024-12-30 18:55:21 +01:00
Martin Kroeker
73527aab3c Merge pull request #5030 from tingboliao/develop
Optimize the zgemm_tcopy_4_rvv function to be compatible with the situations where the vector lengths(vlens) are 128 and 256.
2024-12-30 16:02:46 +01:00
Martin Kroeker
c1258662db Merge branch 'OpenMathLib:develop' into m3m_exprec 2024-12-30 15:58:15 +01:00
Martin Kroeker
36b0fb3aff Merge pull request #5035 from martin-frbg/issue4396
Improve OpenBLASConfig.cmake contents in gmake builds
2024-12-30 09:34:33 +01:00
Martin Kroeker
d863dcf83c Merge pull request #5033 from rgommers/doc-port-last-wiki-edits
docs: update extensions and install pages with last wiki edits
2024-12-30 00:30:08 +01:00
Martin Kroeker
d5e255519e Improve OpenBLASConfig.cmake contents 2024-12-29 22:38:23 +01:00
Ralf Gommers
df42f79c4c docs: update extensions and install pages with last wiki edits
I went through the wiki pages and found two pages with edits that
weren't reflected in the html docs yet, so syncing that content here.
2024-12-26 21:10:32 +01:00
Martin Kroeker
17803e7901 Merge pull request #5031 from david-cortes/fix_doc_links
Fix invalid link to FAQ
2024-12-24 23:20:48 +01:00
david-cortes
762fa1afa9 fix link to faq 2024-12-24 19:48:04 +01:00
Martin Kroeker
6af4e76f31 Merge pull request #5029 from martin-frbg/issue5020
Add support for compiling with Intel oneAPI 2025.0 on MS Windows
2024-12-24 16:10:20 +01:00
Martin Kroeker
fbf594b62f Guard against empty CMAKE_Fortran_COMPILER_ID 2024-12-24 13:34:33 +01:00
tingbo.liao
c4c3d9e68a Merge remote-tracking branch 'refs/remotes/origin/develop' into develop 2024-12-24 10:36:53 +08:00
tingbo.liao
0bea1cfd9d Optimize the zgemm_tcopy_4_rvv function to be compatible with the situations where the vector lengths(vlens) are 128 and 256.
Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>
2024-12-24 10:33:27 +08:00
Martin Kroeker
e6fd629770 Expressly declare the .S extension for assembly (documented as standard, but current cmake does not set it for icx) 2024-12-23 23:18:52 +01:00
Martin Kroeker
05fe49ddaf Rename local copy functions to avoid name clash with the standard BLAS ones 2024-12-23 19:12:17 +01:00
Martin Kroeker
64c6c79201 Assume no underline suffixes on symbols when compiling with Intel ifx on Windows 2024-12-23 19:09:34 +01:00
Martin Kroeker
5c9417d306 Assume no underline suffixes on symbols when compiling with ifx on Windows 2024-12-23 19:07:39 +01:00
Martin Kroeker
5d81e514e4 Assume no underline suffixes on symbols when compiling with ifx on Windows 2024-12-23 19:06:03 +01:00
Martin Kroeker
d78fbe425c Assume no underline suffixes on symbols when compiling with ifx on Windows 2024-12-23 19:04:50 +01:00
Martin Kroeker
30188a55d1 Don't assume underlined symbols for ifx; make cpuid.S inclusion conditional 2024-12-23 19:02:34 +01:00
Martin Kroeker
32319a33ac Add options for Intel oneAPI 2025.0 ifx on Windows 2024-12-23 19:00:48 +01:00
Martin Kroeker
37a4ca7e46 Merge pull request #5025 from martin-frbg/nvidia_arm64
Add target-specific options to enable ARM64 SVE with the NVIDIA compiler
2024-12-19 23:30:53 -08:00
Martin Kroeker
1c4401ebf1 Add target-specific options to enable SVE with the NVIDIA compiler 2024-12-19 14:32:24 -08:00
Martin Kroeker
f2be482d43 Merge pull request #5024 from martin-frbg/issue5001
Improve the wording of the build instructions for Windows on Arm in the docs
2024-12-18 23:05:20 -08:00
Martin Kroeker
70dddacb9f Merge pull request #5023 from rgommers/fix-warnings
Fix two compiler warnings in `memory.c`
2024-12-18 16:13:12 -08:00
Martin Kroeker
a93d3db34a fix formatting of WoA section 2024-12-19 00:53:10 +01:00
Martin Kroeker
e460512685 Update WoA build instructions from rewording in issue #5001 2024-12-19 00:50:37 +01:00
Martin Kroeker
d3cc8c65ed Merge pull request #5022 from tingboliao/develop
Replace the __riscv_vid_v_i32m2 and __riscv_vid_v_i64m2 with __riscv…_vid_v_u32m2 and __riscv_vid_v_u64m2 for riscv64-unknown-linux-gnu-gcc compiling.
2024-12-18 14:29:39 -08:00
Ralf Gommers
765ad8bcd2 Fix guard around alloc_hugetlb, fixes compile warning
The warning was:
```
/home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/driver/others/memory.c: At top level:
/home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/driver/others/memory.c:2565:14: warning: 'alloc_hugetlb' defined but not used [-Wunused-function]
 2565 | static void *alloc_hugetlb(void *address){
      |              ^~~~~~~~~~~~~
```

The added define is the same as is already present in the TLS part of
`memory.c`. This follows up on gh-4681.
2024-12-18 09:42:05 +01:00
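Note on 765ad8bcd2: the `-Wunused-function` warning quoted in that message typically appears when a `static` helper is compiled unconditionally while its only call site sits behind a preprocessor guard. The following is a minimal sketch of that pattern, not the code from `memory.c`. The macro `HAVE_HUGE_PAGES` and the functions `alloc_huge`, `alloc_plain` and `pick_allocator` are hypothetical names chosen for illustration.

```c
#include <stddef.h>

/* HAVE_HUGE_PAGES is a hypothetical feature macro, not the define used in memory.c. */
#ifdef HAVE_HUGE_PAGES
/* Compiled only when the feature is enabled, i.e. under the same condition
 * as its single caller below, so the compiler never sees it defined but unused. */
static void *alloc_huge(void *address) {
    return address; /* placeholder for a real huge-page mapping */
}
#endif

/* Fallback that is always available and always referenced in the #else branch. */
static void *alloc_plain(void *address) {
    return address; /* placeholder for a regular allocation */
}

/* Stores the chosen allocator's result in *out.
 * Returns 1 if the huge-page path was compiled in, 0 otherwise. */
int pick_allocator(void *address, void **out) {
#ifdef HAVE_HUGE_PAGES
    *out = alloc_huge(address);
    return 1;
#else
    *out = alloc_plain(address);
    return 0;
#endif
}
```

If the guard around the definition and the guard around the call ever diverge, the warning returns, which is why keeping the two conditions identical is the usual fix.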
Ralf Gommers
48caf2303d Fix build warning about discarding volatile qualifier in memory.c
The warning was:
```
[4339/5327] Building C object driver/others/CMakeFiles/driver_others.dir/memory.c.o
/home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/driver/others/memory.c: In function 'blas_shutdown':
/home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/driver/others/memory.c:3257:10: warning: passing argument 1 of 'free' discards 'volatile' qualifier from pointer target type [-Wdiscarded-qualifiers]
 3257 |     free(newmemory);
      |          ^~~~~~~~~
In file included from /home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/common.h:83,
                 from /home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/driver/others/memory.c:74:
/home/rgommers/code/pixi-dev-scipystack/openblas/.pixi/envs/default/x86_64-conda-linux-gnu/sysroot/usr/include/stdlib.h:482:25: note: expected 'void *' but argument is of type 'volatile struct newmemstruct *'
  482 | extern void free (void *__ptr) __THROW;
      |                   ~~~~~~^~~~~
```

The use of `volatile` for `newmemstruct` seems on purpose, and there are
more such constructs in this file. The warning appeared after gh-4451
and is correct. The `free` prototype doesn't expect a volatile pointer,
hence this change adds a cast to silence the warning.
2024-12-18 08:53:29 +01:00
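Note on 48caf2303d: the cast described in that message can be sketched as follows. This is an illustration under stated assumptions, not the change itself. `struct mem_node`, `node_list`, `node_list_init` and `node_list_shutdown` are hypothetical stand-ins for the real `newmemstruct` bookkeeping in `memory.c`.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the volatile-qualified struct in memory.c. */
struct mem_node {
    int used;
    struct mem_node *next;
};

/* Pointer to volatile-qualified data, mirroring the type named in the warning. */
static volatile struct mem_node *node_list = NULL;

/* Allocate one node and publish it through the volatile-qualified pointer.
 * Returns 0 on success, -1 if the allocation fails. */
int node_list_init(void) {
    struct mem_node *n = malloc(sizeof *n);
    if (n == NULL) {
        return -1;
    }
    n->used = 0;
    n->next = NULL;
    node_list = n;
    return 0;
}

/* Release the node. free() is declared as taking a plain void *, so the
 * volatile qualifier is dropped with an explicit cast instead of implicitly.
 * Returns 0 if something was freed, -1 if the list was already empty. */
int node_list_shutdown(void) {
    if (node_list == NULL) {
        return -1;
    }
    free((void *)node_list);
    node_list = NULL;
    return 0;
}
```

The explicit cast only documents that discarding the qualifier is intended at this point. It does not change what `free` does with the pointer.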
tingbo.liao
d00cc400b1 Replaced the __riscv_vid_v_i32m2 and __riscv_vid_v_i64m2 with __riscv_vid_v_u32m2 and __riscv_vid_v_u64m2 for riscv64-unknown-linux-gnu-gcc compiling.
Signed-off-by: tingbo.liao <tingbo.liao@starfivetech.com>
2024-12-18 08:38:30 +08:00
Martin Kroeker
229d8a025e Merge pull request #4959 from CDAC-Bengaluru/level-1-sve
SVE Implementation for Level-1 BLAS Routines
2024-12-13 05:20:51 -08:00
SushilPratap04
3368a4e697 Update swap_kernel_sve.c 2024-12-13 16:47:58 +05:30
CDAC-SSDG
dd71e4234a Added Updated swap and rot sve kernels. 2024-12-13 11:15:29 +05:30
CDAC-SSDG
06ffd411a5 Update KERNEL.ARMV8SVE 2024-12-13 11:05:47 +05:30
CDAC-SSDG
41912f9c22 Update CONTRIBUTORS.md 2024-12-13 11:05:10 +05:30
CDAC-SSDG
765850194e Delete kernel/arm64/swap_kernel_sve.c 2024-12-13 11:02:01 +05:30
CDAC-SSDG
c17c19fbcf Delete kernel/arm64/swap_kernel_c.c 2024-12-13 11:01:46 +05:30
CDAC-SSDG
f6416c0e37 Delete kernel/arm64/swap.c 2024-12-13 11:01:32 +05:30
CDAC-SSDG
3b7b74664c Delete kernel/arm64/scal_kernel_sve.c 2024-12-13 11:01:03 +05:30
CDAC-SSDG
95a97012e8 Delete kernel/arm64/scal_kernel_c.c 2024-12-13 11:00:45 +05:30
CDAC-SSDG
5540f2121e Delete kernel/arm64/scal.c 2024-12-13 11:00:12 +05:30
CDAC-SSDG
f62519cc87 Delete kernel/arm64/rot_kernel_sve.c 2024-12-13 10:59:35 +05:30
CDAC-SSDG
10857c9df4 Delete kernel/arm64/rot_kernel_c.c 2024-12-13 10:58:51 +05:30
CDAC-SSDG
b9f51a5cf7 Delete kernel/arm64/rot.c 2024-12-13 10:58:06 +05:30
Martin Kroeker
89f02ed394 Merge pull request #5014 from martin-frbg/issue5013
Add some missed lapack 3.11+ symbols to gensymbol
2024-12-10 23:09:33 -08:00
Martin Kroeker
61d5aec7c1 remove typo 2024-12-11 00:41:56 +01:00
Martin Kroeker
5aea097df0 add missing lapack 3.11+ symbols 2024-12-10 23:52:05 +01:00
Martin Kroeker
72f7b7011c Merge pull request #5009 from martin-frbg/pybenchdoc
DOCS, pybench : Add build notes for Windows and flang from gh Discussion 5008
2024-12-06 02:50:14 -08:00