Optimum Internal Testing

https://github.com/huggingface/optimum

Activity Feed

AI & ML interests

None defined yet.

Recent Activity

echarlaix updated a model about 1 hour ago

optimum-internal-testing/tiny-random-SpeechT5ForTextToSpeech-openvino

echarlaix published a model about 2 hours ago

optimum-internal-testing/tiny-random-SpeechT5ForTextToSpeech-openvino

optimum-internal-testing-user updated a model 5 days ago

optimum-internal-testing/tiny_random_bert_neuronx

View all activity

optimum-internal-testing's activity

echarlaix

updated a model about 1 hour ago

optimum-internal-testing/tiny-random-SpeechT5ForTextToSpeech-openvino

Updated about 2 hours ago

echarlaix

published a model about 2 hours ago

optimum-internal-testing/tiny-random-SpeechT5ForTextToSpeech-openvino

Updated about 2 hours ago

optimum-internal-testing-user

updated a model 5 days ago

optimum-internal-testing/tiny_random_bert_neuronx

Feature Extraction • Updated 5 days ago • 945

michaelbenayoun

updated a model 5 days ago

optimum-internal-testing/optimum-neuron-cache-ci

Updated 5 days ago

optimum-internal-testing-user

published a model 5 days ago

optimum-internal-testing/optimum-neuron-cache-ci

Updated 5 days ago

sayakpaul

authored a paper 12 days ago

From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning

Paper • 2504.16080 • Published 13 days ago • 15

sayakpaul

authored a paper about 2 months ago

SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation

Paper • 2503.09641 • Published Mar 12 • 37

sayakpaul

posted an update 3 months ago

Post

3711

Inference-time scaling meets Flux.1-Dev (and others) 🔥

Presenting a simple re-implementation of "Inference-time scaling diffusion models beyond denoising steps" by Ma et al.

I did the simplest random search strategy, but results can potentially be improved with better-guided search methods.

Supports Gemini 2 Flash & Qwen2.5 as verifiers for "LLMGrading" 🤗

The steps are simple:

For each round:

1> Starting by sampling 2 starting noises with different seeds.
2> Score the generations w.r.t a metric.
3> Obtain the best generation from the current round.

If you have more compute budget, go to the next search round. Scale the noise pool (2 ** search_round) and repeat 1 - 3.

This constitutes the random search method as done in the paper by Google DeepMind.

Code, more results, and a bunch of other stuff are in the repository. Check it out here: https://github.com/sayakpaul/tt-scale-flux/ 🤗

regisss

posted an update 3 months ago

Post

1723

Nice paper comparing the fp8 inference efficiency of Nvidia H100 and Intel Gaudi2: An Investigation of FP8 Across Accelerators for LLM Inference (2502.01070)

The conclusion is interesting: "Our findings highlight that the Gaudi 2, by leveraging FP8, achieves higher throughput-to-power efficiency during LLM inference"

One aspect of AI hardware accelerators that is often overlooked is how they consume less energy than GPUs. It's nice to see researchers starting carrying out experiments to measure this!

Gaudi3 results soon...

sayakpaul

posted an update 3 months ago

Post

2081

We have been cooking a couple of fine-tuning runs on CogVideoX with finetrainers, smol datasets, and LoRA to generate cool video effects like crushing, dissolving, etc.

We are also releasing a LoRA extraction utility from a fully fine-tuned checkpoint. I know that kind of stuff has existed since eternity, but the quality on video models was nothing short of spectacular. Below are some links:

* Models and datasets:

finetrainers
* finetrainers: https://github.com/a-r-r-o-w/finetrainers
* LoRA extraction: https://github.com/huggingface/diffusers/blob/main/scripts/extract_lora_from_model.py

1 reply

sayakpaul

posted an update 3 months ago

Post

2030

We have authored a post to go over the state of video generation in the Diffusers ecosystem 🧨

We cover the models supported, the knobs of optims our users can fire, fine-tuning, and more 🔥

5-6GBs for HunyuanVideo, sky is the limit 🌌 🤗
https://huggingface.co/blog/video_gen

sayakpaul

posted an update 4 months ago

Post

4434

Commits speak louder than words 🤪

* 4 new video models
* Multiple image models, including SANA & Flux Control
* New quantizers -> GGUF & TorchAO
* New training scripts

Enjoy this holiday-special Diffusers release 🤗
Notes: https://github.com/huggingface/diffusers/releases/tag/v0.32.0

regisss

posted an update 5 months ago

Post

1031

Nice to see day 1 support of Falcon 3 on Gaudi with Optimum Habana!

👉 https://www.intel.com/content/www/us/en/developer/articles/technical/intel-ai-solutions-support-falcon-3-fdn-models.html

sayakpaul

posted an update 5 months ago

Post

2251

In the past seven days, the Diffusers team has shipped:

1. Two new video models
2. One new image model
3. Two new quantization backends
4. Three new fine-tuning scripts
5. Multiple fixes and library QoL improvements

Coffee on me if someone can guess 1 - 4 correctly.

1 reply

sayakpaul

posted an update 5 months ago

Post

2181

Introducing a high-quality open-preference dataset to further this line of research for image generation.

Despite being such an inseparable component for modern image generation, open preference datasets are a rarity!

So, we decided to work on one with the community!

Check it out here:
https://huggingface.co/blog/image-preferences

7 replies

sayakpaul

posted an update 5 months ago

Post

2206

The Control family of Flux from @black-forest-labs should be discussed more!

It enables structural controls like ControlNets while being significantly less expensive to run!

So, we're working on a Control LoRA training script 🤗

It's still WIP, so go easy:
https://github.com/huggingface/diffusers/pull/10130

sayakpaul

authored a paper 5 months ago

A Noise is Worth Diffusion Guidance

Paper • 2412.03895 • Published Dec 5, 2024 • 31

sayakpaul

posted an update 5 months ago

Post

1564

Let 2024 be the year of video model fine-tunes!

Check it out here:
https://github.com/a-r-r-o-w/cogvideox-factory/tree/main/training/mochi-1

sayakpaul

posted an update 6 months ago

Post

2720

It's been a while we shipped native quantization support in diffusers 🧨

We currently support bistandbytes as the official backend but using others like torchao is already very simple.

This post is just a reminder of what's possible:

1. Loading a model with a quantization config
2. Saving a model with quantization config
3. Loading a pre-quantized model
4. enable_model_cpu_offload()
5. Training and loading LoRAs into quantized checkpoints

Docs:
https://huggingface.co/docs/diffusers/main/en/quantization/bitsandbytes

1 reply

regisss

posted an update 7 months ago

Post

1424

Interested in performing inference with an ONNX model?⚡️

The Optimum docs about model inference with ONNX Runtime is now much clearer and simpler!

You want to deploy your favorite model on the hub but you don't know how to export it to the ONNX format? You can do it in one line of code as follows:

from optimum.onnxruntime import ORTModelForSequenceClassification

# Load the model from the hub and export it to the ONNX format
model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)

Check out the whole guide 👉 https://huggingface.co/docs/optimum/onnxruntime/usage_guides/models

AI & ML interests

Recent Activity

Team members 11

optimum-internal-testing's activity