OpenBLAS/kernel
Dayuxiaoshui bd45b82ed0 Optimize RISC-V RVV omatcopy_ct implementation with advanced vectorization
- Implement block-based memory access optimization (64x64 blocks)
- Add 4-way loop unrolling to reduce loop overhead
- Optimize VSETVL calls to improve vectorization efficiency
- Add software prefetching for better memory access patterns
- Implement fast path for small matrices (<64x64)
- Add cross-compilation script for RISC-V testing
- Improve boundary handling with separate main/tail loops

Co-authored-by: gong-flying <gongxiaofei24@iscas.ac.cn>
2025-09-11 20:01:39 +08:00
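
The techniques listed in the commit message (64x64 blocking, a small-matrix fast path, a 4-way unrolled main loop with a separate tail loop, and software prefetching) can be illustrated with a portable scalar sketch. This is not the code from the commit: the actual kernel uses RVV intrinsics and VSETVL, which are left out here so the example builds on any host. The names omatcopy_ct_sketch and copy_tile are hypothetical, and the semantics assumed are column-major B[i + j*ldb] = alpha * A[j + i*lda], where A is rows x cols.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK 64 /* block edge, taken from the commit message */

/* Hypothetical helper: scaled transposed copy of one tile,
 * rows [r0, r1) and columns [c0, c1) of the column-major matrix A. */
static void copy_tile(size_t r0, size_t r1, size_t c0, size_t c1,
                      float alpha, const float *a, size_t lda,
                      float *b, size_t ldb)
{
    for (size_t i = c0; i < c1; i++) {
        const float *src = a + i * lda; /* column i of A, contiguous */
        float *dst = b + i;             /* row i of B, stride ldb */
        size_t j = r0;
#if defined(__GNUC__)
        /* Hint the next column of A into cache (GCC/Clang builtin). */
        if (i + 1 < c1)
            __builtin_prefetch(src + lda + r0, 0, 1);
#endif
        /* Main loop: 4-way unrolled. */
        for (; j + 4 <= r1; j += 4) {
            dst[(j + 0) * ldb] = alpha * src[j + 0];
            dst[(j + 1) * ldb] = alpha * src[j + 1];
            dst[(j + 2) * ldb] = alpha * src[j + 2];
            dst[(j + 3) * ldb] = alpha * src[j + 3];
        }
        /* Tail loop: remaining 0..3 elements. */
        for (; j < r1; j++)
            dst[j * ldb] = alpha * src[j];
    }
}

/* Sketch of B = alpha * A^T, with A rows x cols and B cols x rows. */
void omatcopy_ct_sketch(size_t rows, size_t cols, float alpha,
                        const float *a, size_t lda,
                        float *b, size_t ldb)
{
    assert(lda >= rows && ldb >= cols);

    /* Fast path: a matrix smaller than one block is a single tile. */
    if (rows < BLOCK && cols < BLOCK) {
        copy_tile(0, rows, 0, cols, alpha, a, lda, b, ldb);
        return;
    }

    /* Blocked path: walk the matrix in BLOCK x BLOCK tiles. */
    for (size_t c0 = 0; c0 < cols; c0 += BLOCK) {
        size_t c1 = (c0 + BLOCK < cols) ? c0 + BLOCK : cols;
        for (size_t r0 = 0; r0 < rows; r0 += BLOCK) {
            size_t r1 = (r0 + BLOCK < rows) ? r0 + BLOCK : rows;
            copy_tile(r0, r1, c0, c1, alpha, a, lda, b, ldb);
        }
    }
}
```

Blocking keeps the strided writes to B inside a tile small enough to stay cache-resident. In a vectorized version, the unrolled scalar stores would typically be replaced by unit-stride vector loads from A and strided vector stores to B.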