Optimum Internal Testing

Activity Feed

Recent Activity

optimum-internal-testing's activity

sayakpaul
posted an update 4 days ago
Inference-time scaling meets Flux.1-Dev (and others) 🔥

Presenting a simple re-implementation of "Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps" by Ma et al.

I implemented the simplest strategy, random search, but results can potentially be improved with better-guided search methods.

Supports Gemini 2 Flash & Qwen2.5 as verifiers for "LLMGrading" 🤗

The steps are simple:

For each round:

1> Start by sampling 2 starting noises with different seeds.
2> Score the generations w.r.t. a metric.
3> Obtain the best generation from the current round.

If you have more compute budget, go to the next search round: scale the noise pool (2 ** search_round) and repeat steps 1-3.

This constitutes the random search method as done in the paper by Google DeepMind.
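
The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the repository: `generate` and `score` are hypothetical stand-ins for the diffusion pipeline and the verifier, and numbering rounds from 1 (so the first pool holds 2 noises) is an assumption.

```python
import random
from typing import Callable, List, Tuple


def random_search(
    generate: Callable[[int], object],  # hypothetical: seed -> generation
    score: Callable[[object], float],   # hypothetical: generation -> metric, higher is better
    search_rounds: int = 3,
    rng_seed: int = 0,
) -> List[Tuple[int, int, float]]:
    """Return (search_round, best_seed, best_score) for every round."""
    rng = random.Random(rng_seed)
    results = []
    for search_round in range(1, search_rounds + 1):
        pool_size = 2 ** search_round  # 2, 4, 8, ... starting noises
        # 1> Sample starting noises, represented here by distinct seeds.
        seeds = rng.sample(range(2**31), pool_size)
        # 2> Score each generation w.r.t. the metric.
        scored = [(score(generate(seed)), seed) for seed in seeds]
        # 3> Keep the best generation of the current round.
        best_score, best_seed = max(scored)
        results.append((search_round, best_seed, best_score))
    return results


# Toy stand-ins so the sketch runs without a model or a verifier.
def toy_generate(seed: int) -> float:
    return random.Random(seed).random()


def toy_score(generation: float) -> float:
    return generation


if __name__ == "__main__":
    for search_round, seed, value in random_search(toy_generate, toy_score):
        print(f"round {search_round}: best seed {seed}, score {value:.3f}")
```

In a real run, `generate` would call the image model with a seeded noise tensor and `score` would query the verifier; the search logic itself stays the same.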

Code, more results, and a bunch of other stuff are in the repository. Check it out here: https://github.com/sayakpaul/tt-scale-flux/ 🤗
regisss
posted an update 7 days ago
Nice paper comparing the FP8 inference efficiency of Nvidia H100 and Intel Gaudi2: "An Investigation of FP8 Across Accelerators for LLM Inference" (2502.01070)

The conclusion is interesting: "Our findings highlight that the Gaudi 2, by leveraging FP8, achieves higher throughput-to-power efficiency during LLM inference"

One often-overlooked aspect of AI hardware accelerators is that they consume less energy than GPUs. It's nice to see researchers starting to carry out experiments to measure this!

Gaudi3 results soon...
sayakpaul
posted an update 22 days ago
We have been cooking a couple of fine-tuning runs on CogVideoX with finetrainers, smol datasets, and LoRA to generate cool video effects like crushing, dissolving, etc.

We are also releasing a LoRA extraction utility from a fully fine-tuned checkpoint. I know that kind of stuff has existed since eternity, but the quality on video models was nothing short of spectacular. Below are some links:

* Models and datasets: https://huggingface.co/finetrainers
* finetrainers: https://github.com/a-r-r-o-w/finetrainers
* LoRA extraction: https://github.com/huggingface/diffusers/blob/main/scripts/extract_lora_from_model.py
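
For context, LoRA extraction is commonly done by taking the difference between the fine-tuned and base weights of a layer and keeping a low-rank approximation of it via truncated SVD. The sketch below illustrates that idea with NumPy on a toy matrix. It is an assumption that the linked script follows the same recipe, and `extract_lora` and its arguments are hypothetical names.

```python
import numpy as np


def extract_lora(base_weight, tuned_weight, rank):
    """Approximate (tuned - base) as lora_up @ lora_down with the given rank."""
    delta = tuned_weight - base_weight
    # Truncated SVD gives the best rank-`rank` approximation in the Frobenius norm.
    u, s, vh = np.linalg.svd(delta, full_matrices=False)
    lora_up = u[:, :rank] * s[:rank]  # shape: (out_features, rank)
    lora_down = vh[:rank, :]          # shape: (rank, in_features)
    return lora_up, lora_down


rng = np.random.default_rng(0)
base = rng.standard_normal((64, 32))
# Toy "fine-tuning": add an update that is exactly rank 4.
update = rng.standard_normal((64, 4)) @ rng.standard_normal((4, 32))
tuned = base + update

up, down = extract_lora(base, tuned, rank=4)
# Check whether base + up @ down recovers the fine-tuned weights.
print(np.allclose(base + up @ down, tuned))
```

Real fine-tuning updates are rarely exactly low rank, so in practice the chosen rank trades adapter size against how faithfully the fine-tuned behavior is reproduced.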
sayakpaul
posted an update 25 days ago
We have authored a post to go over the state of video generation in the Diffusers ecosystem 🧨

We cover the supported models, the optimization knobs our users can turn, fine-tuning, and more 🔥

5-6 GB for HunyuanVideo; the sky is the limit 🌌 🤗
https://huggingface.co/blog/video_gen
sayakpaul
posted an update about 2 months ago
Commits speak louder than words 🤪

* 4 new video models
* Multiple image models, including SANA & Flux Control
* New quantizers -> GGUF & TorchAO
* New training scripts

Enjoy this holiday-special Diffusers release 🤗
Notes: https://github.com/huggingface/diffusers/releases/tag/v0.32.0
regisss
posted an update 2 months ago
sayakpaul
posted an update 2 months ago
In the past seven days, the Diffusers team has shipped:

1. Two new video models
2. One new image model
3. Two new quantization backends
4. Three new fine-tuning scripts
5. Multiple fixes and library QoL improvements

Coffee on me if someone can guess 1 - 4 correctly.
sayakpaul
posted an update 2 months ago
Introducing a high-quality open-preference dataset to further this line of research for image generation.

Although preference data is such an inseparable component of modern image generation, open preference datasets are a rarity!

So, we decided to work on one with the community!

Check it out here:
https://huggingface.co/blog/image-preferences
sayakpaul
posted an update 2 months ago
The Control family of Flux from @black-forest-labs should be discussed more!

It enables structural controls like ControlNets while being significantly less expensive to run!

So, we're working on a Control LoRA training script 🤗

It's still WIP, so go easy:
https://github.com/huggingface/diffusers/pull/10130