kernels-community

community

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

danieldk updated a model about 15 hours ago

kernels-community/paged-attention

danieldk updated a model about 15 hours ago

kernels-community/mamba-ssm

danieldk updated a model about 17 hours ago

kernels-community/punica-sgmv

danieldk

updated 2 models about 15 hours ago

kernels-community/paged-attention

Updated 7 days ago • 2

kernels-community/mamba-ssm

Updated about 15 hours ago

danieldk

updated a model about 17 hours ago

kernels-community/punica-sgmv

Updated about 17 hours ago

danieldk

updated 5 models about 18 hours ago

danieldk

updated 2 models 3 days ago

RedHatAI/quantization

Updated 3 days ago • 5

RedHatAI/moe

Updated 4 days ago • 1

drbh

updated a model 5 days ago

RedHatAI/moe

Updated 4 days ago • 1

drbh

updated a model 5 days ago

kernels-community/megablocks

Updated 5 days ago • 1

EricB

updated a model 6 days ago

kernels-community/metal-flash-sdpa

Updated 6 days ago • 1

drbh

updated a model 8 days ago

RedHatAI/quantization

Updated 3 days ago • 5

drbh

updated a model 8 days ago

kernels-community/activation

Updated 8 days ago • 4

danieldk

posted an update 14 days ago

Post

1802

kernels 0.8.0 is out: https://github.com/huggingface/kernels/releases/tag/v0.8.0

This release refines kernel selection in the kernelize function:

• You can now register kernels for certain CUDA capability ranges.
• Rather than requiring an exact match of modes, kernelize now falls back to other compatible modes. If you are kernelizing for inference but have only registered a training + torch.compile kernel, it will use that kernel, since it is compatible with inference as well.
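
The selection behaviour in the two points above can be sketched in plain Python. This is an illustration only, not the library's implementation: `Registered`, `select_kernel`, the kernel names, and the capability encoding (80 standing for CUDA capability 8.0) are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Registered:
    """Hypothetical record of one registered kernel (not a kernels class)."""
    name: str
    training: bool               # usable when gradients are needed
    torch_compile: bool          # usable under torch.compile
    min_capability: int = 0      # assumed encoding: 80 means capability 8.0
    max_capability: int = 10_000


def select_kernel(
    registered: Sequence[Registered],
    *,
    training: bool,
    torch_compile: bool,
    capability: int,
) -> Optional[Registered]:
    """Pick an exact mode match if there is one, else a compatible kernel."""

    def compatible(k: Registered) -> bool:
        return (
            k.min_capability <= capability <= k.max_capability
            and (k.training or not training)
            and (k.torch_compile or not torch_compile)
        )

    def unneeded_features(k: Registered) -> int:
        # 0 means the kernel was registered for exactly the requested mode.
        return (k.training != training) + (k.torch_compile != torch_compile)

    candidates = [k for k in registered if compatible(k)]
    # min() keeps registration order on ties and returns None if nothing fits.
    return min(candidates, key=unneeded_features, default=None)


kernels_for_layer = [
    Registered("hopper-only", training=False, torch_compile=False,
               min_capability=90, max_capability=90),
    Registered("train+compile", training=True, torch_compile=True),
]

# Inference on a capability 8.0 device: the range rules out "hopper-only",
# so the training + torch.compile kernel is used as a compatible fallback.
chosen = select_kernel(kernels_for_layer, training=False,
                       torch_compile=False, capability=80)
print(chosen.name)
```

On a capability 9.0 device the same request would pick "hopper-only" instead, because it matches the requested mode exactly.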

1 reply

danieldk

posted an update 18 days ago

Post

311

You can get flash-attention 3 ⚡️ directly from the hub now using kernels!

kernels-community/flash-attn3
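
A minimal sketch of loading it, assuming the `kernels` package is installed and you are on a machine with a supported GPU and network access. `get_kernel` is the library's loader; the helper names `load_flash_attn3` and `public_names` are made up for this example, and the functions the loaded module exposes are not assumed here, so the sketch only lists them.

```python
def load_flash_attn3():
    """Fetch the flash-attention 3 kernel from the Hub."""
    # Imported lazily so this file can be imported without `kernels` installed.
    from kernels import get_kernel

    # Downloads a build for the local setup and returns it as a Python module.
    return get_kernel("kernels-community/flash-attn3")


def public_names(module):
    """List what a loaded kernel module exposes, without guessing its API."""
    return sorted(name for name in dir(module) if not name.startswith("_"))
```

Calling `public_names(load_flash_attn3())` shows which functions the kernel provides before you wire any of them into a model.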

danieldk

posted an update 18 days ago

Post

267

Kernels 0.7.0 is out: https://github.com/huggingface/kernels/releases/tag/v0.7.0 🚀

This release makes it possible to register multiple kernels for a layer. Do you have a super-fast kernel for inference and another kernel for training? Register them both and kernelize will pick the kernel depending on whether you are going to do training or inference.
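
A rough sketch of what registering two kernels for one layer might look like. The repository ids are placeholders, not real repositories, and the helper names `build_mapping` and `kernelize_model` are made up for this example. Treat the exact signatures as assumptions to check against the release notes for your installed version.

```python
def build_mapping():
    """Build a per-mode mapping for one layer (repo ids are placeholders)."""
    # Lazy import so the sketch can be imported without `kernels` installed.
    from kernels import LayerRepository, Mode

    return {
        "SiluAndMul": {
            "cuda": {
                Mode.INFERENCE: LayerRepository(
                    repo_id="your-org/activation-inference",  # placeholder
                    layer_name="SiluAndMul",
                ),
                Mode.TRAINING: LayerRepository(
                    repo_id="your-org/activation-training",  # placeholder
                    layer_name="SiluAndMul",
                ),
            }
        }
    }


def kernelize_model(model, training: bool):
    """Register both kernels, then let kernelize pick one by mode."""
    from kernels import Mode, kernelize, register_kernel_mapping

    register_kernel_mapping(build_mapping())
    mode = Mode.TRAINING if training else Mode.INFERENCE
    return kernelize(model, mode=mode)
```

This assumes the model's layers have been made kernel-aware under the name "SiluAndMul", for example with the library's `use_kernel_forward_from_hub` decorator.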

Narsil

posted an update about 2 months ago

Post

1767

Me: This function is too slow. Find a faster algorithm.
Cursor: Hold my beer.

Me: *Slacking off with colleagues*
Cursor: Ping.

Me: 🤯

danieldk

posted an update about 2 months ago

Post

1737

We have been working on a project called kernels. kernels makes it possible to load compute kernels directly from the Hub! 🚀

We plan to give kernels a proper introduction soon. But for those who have been following along, we are happy to announce a new release:

- New layer API with torch.compile support.
- Experimental support for loading Apple Silicon Metal 🤘 Kernels.
- Generate wheels from Hub kernels for legacy deployments.

Full release notes here: https://github.com/huggingface/kernels/releases/tag/v0.6.0

Team members 9

kernels-community's activity