yano
Promising model

decent RP model!

Good model!

v2 - thoughts


Optimum v1.27 marks the final major release in the v1 series. As we close this chapter, we're laying the groundwork for a more modular and community-driven future:
- Optimum v2: A lightweight core package for porting Transformers, Diffusers, or Sentence-Transformers to specialized AI hardware/software/accelerators.
- Optimum-ONNX: A dedicated package where the ONNX/ONNX Runtime ecosystem lives and evolves, faster-moving and decoupled from the Optimum core.
🎯 Why this matters:
- A clearer governance path for ONNX, fostering stronger community collaboration and an improved developer experience.
- Faster innovation in a more modular, open-source environment.
💡 What this means:
- More transparency, broader participation, and faster development driven by the community and key actors in the ONNX ecosystem (PyTorch, Microsoft, Joshua Lochner, ...)
- A cleaner, more maintainable Optimum core, focused on extending HF libraries to specialized AI hardware/software/accelerator tooling and used by our partners (Intel Corporation, Amazon Web Services (AWS), AMD, NVIDIA, FuriosaAI, ...)
🛠️ Major updates I worked on in this release:
✅ Added support for Transformers v4.53 and SmolLM3 in ONNX/ONNX Runtime.
✅ Fixed batched inference/generation for all supported decoder model architectures (LLMs).
✨ Big shoutout to @echarlaix for leading the refactoring work that cleanly separated the ONNX exporter logic and enabled the creation of Optimum-ONNX.
Release Notes: https://lnkd.in/gXtE_qji
Optimum: https://lnkd.in/ecAezNT6
Optimum-ONNX: https://lnkd.in/gzjyAjSi
#Optimum #ONNX #OpenSource #HuggingFace #Transformers #Diffusers
Nice!

RTX A6000, lists 48 GB video memory...
Soooo jealous....
Don't be, it's Ampere. I have 2 of them; I'd much rather have 2 RTX 6000 Pros nowadays. Much, much faster.
Jealouss.........
Working?

If only every model's description page was as detailed as yours.

671B to 2.7T?!? I can barely run 235B models! (and that's with Q3 or Q2).
"Ultra" is the keyword here. We are going to release Pro and Mini models.
The downside is it isn't very well supported by many apps
Gotcha. I'll keep fingers crossed then
Yes! It has gone from 671B to 2.7T in slightly more than 2 weeks.
Regardless, as long as it performs well and does the job, I suppose.
But I wonder whether trimming the models, or pushing harder on size-vs-performance optimization, shouldn't be a bigger priority. Though I'm new to this scene, so I could just be ignorant of how this is all done.
decent model
