triton-kernels
triton-kernels is a set of kernels that enable fast mixture-of-experts (MoE) computation on different architectures. The kernels support several precisions (e.g. bf16, mxfp4).
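For readers unfamiliar with what an MoE layer computes, here is a minimal pure-Python sketch of top-k expert routing for a single token. It is not the triton-kernels API: the names `moe_forward`, `router_w`, and `expert_ws` are hypothetical, each expert is reduced to a single weight matrix, and taking the softmax over only the selected logits is one common convention assumed here. Optimized kernels would be expected to perform the same routing and matrix products in batched, low-precision form on the accelerator.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def moe_forward(x, router_w, expert_ws, top_k=2):
    """Reference top-k MoE for one token (illustrative, hypothetical names).

    x         : input vector
    router_w  : one row per expert; produces one routing logit per expert
    expert_ws : one weight matrix per expert (assumption: a single linear map)
    """
    logits = matvec(router_w, x)
    # Pick the top_k experts with the largest routing logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Assumption: gate weights are a softmax over the selected logits only.
    gates = softmax([logits[i] for i in top])
    out = [0.0] * len(expert_ws[0])
    for gate, expert in zip(gates, top):
        y = matvec(expert_ws[expert], x)
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, top
```

Only the selected experts are evaluated for a given token, which is why the routing and gather steps dominate the design of fast MoE kernels.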
Original code: https://github.com/triton-lang/triton/tree/main/python/triton_kernels
The current version corresponds to commit 7d0efaa7231661299284a603512fce4fa255e62c.