# Vec-to-Cube Pattern

[Free download link] cannbot-skills: CANNBot is a family of agents that improve development efficiency for CANN; this repository provides its reusable Skills modules. Project address: https://gitcode.com/cann/cannbot-skills

Authoring statement: parts of this article were generated with AI assistance (AIGC) and are for reference only.

Generic baseline only. For a2 (b3) kernels, prefer the a2-specific patterns under `agent/references/patterns/` (e.g., `a2-cube-vec.md`) and read `agent/references/constraints/a2-device.md` for device-side rules.

Read this file when vec work preprocesses data before cube consumes it in a later matmul stage.

## Use this pattern when

- the formula needs elementwise or row-wise preprocessing first
- the cube stage should consume the transformed result
- the host-side contract should stay reshape-only instead of doing a heavy layout transform outside the kernel

## Minimal flow

`GM -> UB -> vf -> UB -> L1 -> L0 -> L0C -> GM`

## Ownership rule

The vec-to-cube publish is a cross-side ownership edge. Use an explicit `VcMutex`. Do not expect `auto_sync()` to replace it.

Stable repository mapping:

- `VcMutex(..., src_end_pipe=Pipe.MTE3, dst_end_pipe=Pipe.FIX)`

## What usually matters most

- whether the publish path is ND or NZ
- whether the host-side layout stays reshape-only
- how subblock rows are split between vec sides
- whether the preprocessed value must remain in half or float before the cube stage consumes it

## Typical files to study

- `agent/example/kernels/a5/vec_cube_abs_sqrt_matmul.py`
- `agent/example/kernels/a5/vec_cube_abs_sqrt_matmul_nz.py`
- `agent/example/kernels/a5/recompute_wu_cube_vec.py`
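
To make the vec-then-cube ordering concrete, here is a minimal host-side reference sketch in plain Python. It is not the repository's kernel API and does not use `VcMutex`. The formula `sqrt(abs(x)) @ w` is an assumption inferred from the example file name `vec_cube_abs_sqrt_matmul.py`, and the helper names (`split_rows`, `vec_preprocess`, `cube_matmul`, `vec_to_cube_reference`) and the two-way row split are hypothetical. It only illustrates the data dependency: each vec side preprocesses its own subblock of rows, and the matmul runs only after every subblock has been published.

```python
import math


def split_rows(num_rows, num_vec_sides=2):
    """Split rows into contiguous subblocks, one per vec side (assumed policy)."""
    base, extra = divmod(num_rows, num_vec_sides)
    bounds, start = [], 0
    for side in range(num_vec_sides):
        stop = start + base + (1 if side < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def vec_preprocess(rows):
    """Stand-in for the vec stage: elementwise sqrt(abs(x)) (assumed formula)."""
    return [[math.sqrt(abs(v)) for v in row] for row in rows]


def cube_matmul(a, b):
    """Stand-in for the cube stage: a plain (M x K) @ (K x N) matmul."""
    k, n = len(b), len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]


def vec_to_cube_reference(x, w, num_vec_sides=2):
    """Preprocess every row subblock first, then let the matmul consume it."""
    published = [None] * len(x)
    for start, stop in split_rows(len(x), num_vec_sides):
        # Each vec side publishes only the rows it owns.
        published[start:stop] = vec_preprocess(x[start:stop])
    # On device, this hand-off is the ownership edge that needs explicit sync.
    return cube_matmul(published, w)


if __name__ == "__main__":
    x = [[-4.0, 9.0], [16.0, -1.0], [0.0, 25.0]]
    w = [[1.0, 0.0], [0.0, 2.0]]
    print(vec_to_cube_reference(x, w))  # [[2.0, 6.0], [4.0, 2.0], [0.0, 10.0]]
```

A reference like this can serve as a correctness check for a kernel's output on small shapes. It says nothing about ND versus NZ publish layout or about half versus float precision, which still have to be decided on the device side.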