OpenBLAS mostly implements the reference (Netlib) BLAS, CBLAS, LAPACK and LAPACKE interfaces. It also provides a few OpenBLAS-specific functions, most of which can be seen as "BLAS extensions". This page documents those non-standard APIs.
BLAS-like extensions
| Routine | Data Types | Description |
|---|---|---|
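| ?axpby | s,d,c,z | like axpy, but also scales y: y = α*x+β*y |
| ?gemm3m | c,z | complex gemm computed with the 3M algorithm (three real multiplications per complex product instead of four) |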
| ?imatcopy | s,d,c,z | in-place transposition/copying |
| ?omatcopy | s,d,c,z | out-of-place transposition/copying |
| ?geadd | s,d,c,z | ATLAS-like matrix add B = α*A+β*B |
| ?gemmt | s,d,c,z | like gemm, but only the upper or lower triangular part of C is updated |
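
To make the table's one-line descriptions concrete, here is a minimal plain-C sketch of what `?axpby` and `?geadd` compute in single precision. The names `ref_saxpby` and `ref_sgeadd` are hypothetical helpers written for this page, not OpenBLAS functions, and the sketch assumes positive strides and column-major storage. The real routines are optimized and take additional arguments.

```c
#include <assert.h>

/* Hypothetical reference helper: y = alpha*x + beta*y.
   Assumes positive strides; not the OpenBLAS implementation. */
void ref_saxpby(int n, float alpha, const float *x, int incx,
                float beta, float *y, int incy)
{
    assert(incx > 0 && incy > 0);
    for (int i = 0; i < n; i++) {
        y[i * incy] = alpha * x[i * incx] + beta * y[i * incy];
    }
}

/* Hypothetical reference helper: B = alpha*A + beta*B.
   Assumes column-major storage with leading dimensions lda and ldb. */
void ref_sgeadd(int rows, int cols, float alpha, const float *a, int lda,
                float beta, float *b, int ldb)
{
    assert(lda >= rows && ldb >= rows);
    for (int j = 0; j < cols; j++) {
        for (int i = 0; i < rows; i++) {
            b[i + j * ldb] = alpha * a[i + j * lda] + beta * b[i + j * ldb];
        }
    }
}
```

With `beta` set to 1, `ref_saxpby` reduces to the standard axpy update, which is the sense in which axpby is "axpy with a multiplier for y".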
bfloat16 functionality
BLAS-like and conversion functions for bfloat16 (available when OpenBLAS was compiled with BUILD_BFLOAT16=1):
- `void cblas_sbstobf16` converts a float array to an array of bfloat16 values by rounding
- `void cblas_sbdtobf16` converts a double array to an array of bfloat16 values by rounding
- `void cblas_sbf16tos` converts a bfloat16 array to an array of floats
- `void cblas_dbf16tod` converts a bfloat16 array to an array of doubles
- `float cblas_sbdot` computes the dot product of two bfloat16 arrays
- `void cblas_sbgemv` performs the matrix-vector operations of GEMV with the input matrix and X vector as bfloat16
- `void cblas_sbgemm` performs the matrix-matrix operations of GEMM with both input arrays containing bfloat16
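
As an illustration of what "by rounding" can mean for the conversions above, the following sketch converts a float to bfloat16 with round-to-nearest-even, converts back, and computes a dot product with float accumulation. All names (`bf16_t`, `float_to_bf16`, `bf16_to_float`, `ref_sbdot`) are hypothetical. The rounding mode, NaN handling, and unit strides are assumptions of this sketch, so the OpenBLAS kernels may behave differently in detail.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t bf16_t; /* hypothetical alias for a 16-bit bfloat16 value */

/* Keep the upper 16 bits of the IEEE-754 float, rounding to nearest even. */
bf16_t float_to_bf16(float f)
{
    uint32_t bits;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &f, sizeof bits);
    if ((bits & 0x7fffffffu) > 0x7f800000u) {
        /* NaN: truncate and force a mantissa bit so it stays a NaN */
        return (bf16_t)((bits >> 16) | 0x0040u);
    }
    uint32_t lsb = (bits >> 16) & 1u; /* lowest bit that survives */
    bits += 0x7fffu + lsb;            /* ties go to the even result */
    return (bf16_t)(bits >> 16);
}

/* Widening is exact: place the 16 bits in the upper half of a float. */
float bf16_to_float(bf16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Dot product of two unit-stride bfloat16 arrays, accumulated in float. */
float ref_sbdot(int n, const bf16_t *x, const bf16_t *y)
{
    float acc = 0.0f;
    for (int i = 0; i < n; i++) {
        acc += bf16_to_float(x[i]) * bf16_to_float(y[i]);
    }
    return acc;
}
```

Because bfloat16 keeps the float exponent but only 7 explicit mantissa bits, the float-to-bfloat16 direction loses precision while the reverse direction does not.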
Utility functions
- `openblas_get_num_threads`
- `openblas_set_num_threads`
- `int openblas_get_num_procs(void)` returns the number of processors available on the system (may include "hyperthreading cores")
- `int openblas_get_parallel(void)` returns 0 for sequential use, 1 for platform-based threading and 2 for OpenMP-based threading
- `char * openblas_get_config()` returns the options OpenBLAS was built with, something like `NO_LAPACKE DYNAMIC_ARCH NO_AFFINITY Haswell`
- `int openblas_set_affinity(int thread_index, size_t cpusetsize, cpu_set_t *cpuset)` sets the CPU affinity mask of the given thread to the provided cpuset. Only available on Linux, with semantics identical to `pthread_setaffinity_np`.
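
The return values described above are easy to interpret with a little plain C. The sketch below shows two hypothetical helpers, `parallel_mode_name` and `config_has_option`, that could be applied to the results of `openblas_get_parallel()` and `openblas_get_config()`. They do not call OpenBLAS themselves, and the assumption that build options are separated by single spaces is taken only from the example string above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: name the threading mode code documented for
   openblas_get_parallel() (0, 1 or 2). */
const char *parallel_mode_name(int mode)
{
    switch (mode) {
    case 0:  return "sequential";
    case 1:  return "platform threads";
    case 2:  return "OpenMP";
    default: return "unknown";
    }
}

/* Hypothetical helper: return 1 if `option` appears as a whole
   space-separated token in `config`, else 0. */
int config_has_option(const char *config, const char *option)
{
    assert(config != NULL && option != NULL);
    size_t len = strlen(option);
    if (len == 0) {
        return 0;
    }
    const char *p = config;
    while ((p = strstr(p, option)) != NULL) {
        int starts_token = (p == config) || (p[-1] == ' ');
        int ends_token = (p[len] == '\0') || (p[len] == ' ');
        if (starts_token && ends_token) {
            return 1;
        }
        p += 1; /* keep searching after a partial match */
    }
    return 0;
}
```

In a program linked against OpenBLAS, one might write `config_has_option(openblas_get_config(), "DYNAMIC_ARCH")` to check how the library was built. Matching whole tokens avoids false positives such as finding `ARCH` inside `DYNAMIC_ARCH`.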