HunyuanVideo Community

AI & ML interests

None defined yet.

Recent Activity

a-r-r-o-w new activity 28 days ago

hunyuanvideo-community/HunyuanVideo-I2V:RuntimeError: The size of tensor a (7) must match the size of tensor b (16) at non-singleton dimension 1

a-r-r-o-w updated a model 29 days ago

hunyuanvideo-community/HunyuanVideo-I2V-33ch

a-r-r-o-w updated a model 29 days ago

hunyuanvideo-community/HunyuanVideo-I2V


hunyuanvideo-community's activity

a-r-r-o-w

in hunyuanvideo-community/HunyuanVideo-I2V 28 days ago

RuntimeError: The size of tensor a (7) must match the size of tensor b (16) at non-singleton dimension 1

#1 opened 29 days ago by

DsnTgr

a-r-r-o-w

updated 2 models 29 days ago

hunyuanvideo-community/HunyuanVideo-I2V-33ch

Updated 29 days ago • 70 • 6

hunyuanvideo-community/HunyuanVideo-I2V

Image-to-Video • Updated 29 days ago • 638 • 2

a-r-r-o-w

published a model 29 days ago

hunyuanvideo-community/HunyuanVideo-I2V

Image-to-Video • Updated 29 days ago • 638 • 2

a-r-r-o-w

in hunyuanvideo-community/HunyuanVideo-I2V-33ch about 1 month ago

ValueError: Image features and image tokens do not match: tokens: 1, features 576

#2 opened about 1 month ago by

DsnTgr

sayakpaul

authored a paper about 1 month ago

SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation

Paper • 2503.09641 • Published Mar 12 • 36

a-r-r-o-w

in hunyuanvideo-community/HunyuanVideo-I2V-33ch about 1 month ago

https://github.com/huggingface/diffusers/pull/10983#issuecomment-2712782698

#1 opened about 1 month ago by

tolgacangoz

a-r-r-o-w

published a model about 1 month ago

hunyuanvideo-community/HunyuanVideo-I2V-33ch

Updated 29 days ago • 70 • 6

sayakpaul

posted an update about 2 months ago

Post

3531

Inference-time scaling meets Flux.1-Dev (and others) 🔥

Presenting a simple re-implementation of "Inference-time scaling diffusion models beyond denoising steps" by Ma et al.

I did the simplest random search strategy, but results can potentially be improved with better-guided search methods.

Supports Gemini 2 Flash & Qwen2.5 as verifiers for "LLMGrading" 🤗

The steps are simple:

For each round:

1> Start by sampling 2 starting noises with different seeds.
2> Score the generations w.r.t. a metric.
3> Obtain the best generation from the current round.

If you have more compute budget, go to the next search round: scale the noise pool (2 ** search_round) and repeat steps 1-3.

This constitutes the random search method as done in the paper by Google DeepMind.
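The loop described above can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: `random_search`, `toy_generate`, and `toy_score` are hypothetical names, the "generation" is a single Gaussian draw standing in for a diffusion pipeline call, and the score is a stand-in for a real verifier such as an LLM grader. It also assumes rounds are counted from 1, so the pool sizes are 2, 4, 8, and so on.

```python
import random
from typing import Callable, List, Tuple


def random_search(
    generate: Callable[[int], float],
    score: Callable[[float], float],
    search_rounds: int = 3,
    seed: int = 0,
) -> List[Tuple[int, int, float]]:
    """Return (pool_size, best_seed, best_score) for every search round."""
    rng = random.Random(seed)
    results = []
    for search_round in range(1, search_rounds + 1):
        # Assumption: round 1 starts with 2 noises, then the pool doubles.
        pool_size = 2 ** search_round
        # Draw distinct seeds, one per starting noise in the pool.
        seeds = rng.sample(range(2 ** 32), pool_size)
        # Generate one sample per seed and score it with the verifier.
        scored = [(score(generate(s)), s) for s in seeds]
        # Keep the best generation of the current round.
        best_score, best_seed = max(scored)
        results.append((pool_size, best_seed, best_score))
    return results


def toy_generate(noise_seed: int) -> float:
    # Hypothetical stand-in for running the pipeline from a seeded noise.
    return random.Random(noise_seed).gauss(0.0, 1.0)


def toy_score(sample: float) -> float:
    # Hypothetical verifier: samples closer to zero score higher.
    return -abs(sample)


if __name__ == "__main__":
    for pool_size, best_seed, best_score in random_search(toy_generate, toy_score):
        print(f"pool={pool_size} best_seed={best_seed} score={best_score:.4f}")
```

In a real setup, `generate` would wrap the image pipeline seeded with `noise_seed`, and `score` would call the chosen verifier; the search loop itself would not need to change.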

Code, more results, and a bunch of other stuff are in the repository. Check it out here: https://github.com/sayakpaul/tt-scale-flux/ 🤗

sayakpaul

posted an update 3 months ago

Post

2066

We have been cooking a couple of fine-tuning runs on CogVideoX with finetrainers, smol datasets, and LoRA to generate cool video effects like crushing, dissolving, etc.

We are also releasing a LoRA extraction utility from a fully fine-tuned checkpoint. I know that kind of stuff has existed since eternity, but the quality on video models was nothing short of spectacular. Below are some links:

* Models and datasets: finetrainers
* finetrainers: https://github.com/a-r-r-o-w/finetrainers
* LoRA extraction: https://github.com/huggingface/diffusers/blob/main/scripts/extract_lora_from_model.py

1 reply

sayakpaul

posted an update 3 months ago

Post

2016

We have authored a post to go over the state of video generation in the Diffusers ecosystem 🧨

We cover the supported models, the optimization knobs our users can turn, fine-tuning, and more 🔥

5-6 GB for HunyuanVideo, sky is the limit 🌌 🤗
https://huggingface.co/blog/video_gen

a-r-r-o-w

in hunyuanvideo-community/HunyuanVideo 3 months ago

No cfg?

#2 opened 3 months ago by

wang0422

a-r-r-o-w

in hunyuanvideo-community/HunyuanVideo 4 months ago

Black output

#1 opened 4 months ago by

Shuaishuai0219

sayakpaul

in hunyuanvideo-community/HunyuanVideo 4 months ago

Black output

#1 opened 4 months ago by

Shuaishuai0219

sayakpaul

posted an update 4 months ago

Post

4419

Commits speak louder than words 🤪

* 4 new video models
* Multiple image models, including SANA & Flux Control
* New quantizers -> GGUF & TorchAO
* New training scripts

Enjoy this holiday-special Diffusers release 🤗
Notes: https://github.com/huggingface/diffusers/releases/tag/v0.32.0

a-r-r-o-w

updated a model 4 months ago

hunyuanvideo-community/HunyuanVideo

Updated Dec 22, 2024 • 24.2k • 28

sayakpaul

posted an update 4 months ago

Post

2240

In the past seven days, the Diffusers team has shipped:

1. Two new video models
2. One new image model
3. Two new quantization backends
4. Three new fine-tuning scripts
5. Multiple fixes and library QoL improvements

Coffee on me if someone can guess 1 - 4 correctly.

1 reply

sayakpaul

posted an update 4 months ago

Post

2172

Introducing a high-quality open-preference dataset to further this line of research for image generation.

Despite being such an inseparable component of modern image generation, open preference datasets are a rarity!

So, we decided to work on one with the community!

Check it out here:
https://huggingface.co/blog/image-preferences

7 replies

sayakpaul

posted an update 4 months ago

Post

2198

The Control family of Flux from @black-forest-labs should be discussed more!

It enables structural controls like ControlNets while being significantly less expensive to run!

So, we're working on a Control LoRA training script 🤗

It's still WIP, so go easy:
https://github.com/huggingface/diffusers/pull/10130

sayakpaul

authored a paper 4 months ago

A Noise is Worth Diffusion Guidance

Paper • 2412.03895 • Published Dec 5, 2024 • 31

Team members 5
