Joao Gante

joaogante

https://github.com/gante


Recent Activity

new activity 4 days ago

google/diffusiongemma-26B-A4B-it:Update non-thinking chat template

new activity 4 days ago

google/diffusiongemma-26B-A4B-it:RuntimeError: Tensor.item() cannot be called on meta tensors

new activity 11 days ago

google/diffusiongemma-26B-A4B-it:Argument input_ids not found in the forward method


Posts 4


Let's go! Custom generation code has landed in transformers 🚀

Have you designed a cool new KV cache? Maybe you're comparing test-time compute ideas you've been researching? Have you found a way to do diffusion with existing models? You can now easily share your findings with the community through custom generation code, all behind the well-known generate interface 🤓

In a nutshell, we have expanded the support of custom modeling code on the Hub with *model-agnostic* custom generation code. Write for one model, reuse with any model -- hopefully, this will democratize access to new generation ideas 🫡
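To make "write for one model, reuse with any model" concrete, here is a toy sketch in plain Python. It is not the transformers API: `ToyModel`, `counting_model`, and `repeating_model` are hypothetical stand-ins, and the `generate` signature here is simplified for illustration. The point is only that a decoding loop which depends on a narrow model interface (here, "token ids in, next-token scores out") can be reused unchanged across models.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for a model: any callable that maps the token ids
# seen so far to a list of next-token scores (one score per vocabulary id).
ToyModel = Callable[[Sequence[int]], List[float]]


def generate(
    model: ToyModel,
    input_ids: Sequence[int],
    max_new_tokens: int = 5,
    eos_token_id: int = 0,
) -> List[int]:
    """Greedy decoding that relies only on the callable interface of `model`."""
    output = list(input_ids)
    for _ in range(max_new_tokens):
        scores = model(output)
        # Pick the highest-scoring token id (first one wins on ties).
        next_token = max(range(len(scores)), key=scores.__getitem__)
        output.append(next_token)
        if next_token == eos_token_id:
            break
    return output


def counting_model(ids: Sequence[int]) -> List[float]:
    """Toy model: prefers (last token + 1) modulo a vocabulary of 5."""
    vocab_size = 5
    target = (ids[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


def repeating_model(ids: Sequence[int]) -> List[float]:
    """Toy model: always prefers to repeat the last token (vocabulary of 5)."""
    return [1.0 if i == ids[-1] else 0.0 for i in range(5)]


# The same decoding function drives two unrelated "models".
print(generate(counting_model, [1]))                      # [1, 2, 3, 4, 0]
print(generate(repeating_model, [3], max_new_tokens=3))   # [3, 3, 3, 3]
```

The real repository layout and loading call are described in the docs and the minimal example listed under Resources; treat those as the reference rather than this sketch.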

As a creator, you can get your ideas into transformers with minimal effort. You'll also have access to all Hub features: a landing page for your creation, discussions, usage metrics, and more 🤓

💎 Resources 💎
- docs: https://huggingface.co/docs/transformers/generation_strategies#custom-decoding-methods
- minimal example: transformers-community/custom_generate_example
- discussion: transformers-community/support#10