Sayak Paul
sayakpaul
672 followers · 55 following
https://sayak.dev
RisingSayak
sayakpaul
AI & ML interests
Diffusion models, representation learning
Recent Activity
posted an update about 16 hours ago

Fast LoRA inference for Flux with Diffusers and PEFT 🚨

There are great materials that demonstrate how to optimize inference for popular image generation models, such as Flux. However, very few cover how to serve LoRAs fast, even though LoRAs are inseparable from how these models are adopted.

In our latest post, @BenjaminB and I show different techniques to optimize LoRA inference for the Flux family of image generation models. Our recipe includes:

1. `torch.compile`
2. Flash Attention 3 (when compatible)
3. Dynamic FP8 weight quantization (when compatible)
4. Hotswapping to avoid recompilation when swapping in new LoRAs 🤯

We tested our recipe with Flux.1-Dev on both an H100 and an RTX 4090, and achieved at least a *2x speedup* on each GPU. We believe our recipe is grounded in the reality of how LoRA-based use cases are generally served, so we hope it will be beneficial to the community 🤗

Even though our recipe was tested primarily on NVIDIA GPUs, it should also work on AMD GPUs.

Learn the details and the full code here: https://huggingface.co/blog/lora-fast
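To make the hotswapping idea in the post concrete, here is a minimal sketch of how it might look with Diffusers. It is not the code from the linked blog post: the helper names `plan_lora_loads` and `generate_with_hotswapped_loras`, the adapter name `"lora"`, and the `lora_ids` and `max_rank` arguments are assumptions made for illustration, and Flash Attention 3 and FP8 quantization are left out because the post describes them as hardware-dependent. The sketch assumes a recent Diffusers release that provides `enable_lora_hotswap` and the `hotswap=True` argument of `load_lora_weights`.

```python
from typing import Iterable, List, Tuple


def plan_lora_loads(lora_ids: Iterable[str]) -> List[Tuple[str, bool]]:
    """Pair each LoRA with a hotswap flag: False for the first, True afterwards."""
    return [(lora_id, index > 0) for index, lora_id in enumerate(lora_ids)]


def generate_with_hotswapped_loras(lora_ids, prompt, max_rank):
    """Hypothetical helper: one image per LoRA, compiling the transformer once."""
    # Heavy imports live inside the function so this module can be imported
    # on a machine without a GPU stack installed.
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    ).to("cuda")

    # Call this before loading the first LoRA. max_rank is assumed to be the
    # largest rank among all adapters that will be swapped in later.
    pipe.enable_lora_hotswap(target_rank=max_rank)

    images = []
    for lora_id, hotswap in plan_lora_loads(lora_ids):
        if hotswap:
            # Replace the weights of the existing adapter in place, so the
            # compiled graph can be reused instead of being rebuilt.
            pipe.load_lora_weights(lora_id, hotswap=True, adapter_name="lora")
        else:
            pipe.load_lora_weights(lora_id, adapter_name="lora")
            # Compile only after the first LoRA has been loaded.
            pipe.transformer.compile(fullgraph=True)
        images.append(pipe(prompt).images[0])
    return images
```

The design point is the ordering: enable hotswapping, load the first adapter, compile, then load every later adapter with `hotswap=True` under the same adapter name. Whether this avoids recompilation in a given setup depends on the LoRAs targeting compatible layers and ranks, so treat it as a starting point and check the linked post for the tested recipe.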
commented on their article about 16 hours ago
Fast LoRA inference for Flux with Diffusers and PEFT
new activity about 24 hours ago
black-forest-labs/FLUX.1-Kontext-dev: pipe.to("cuda") runs super slow, is it expected?
Organizations
sayakpaul's models (58)
sayakpaul/different-lora-from-civitai • 12B • Updated Jun 18 • 18 • 1
sayakpaul/flux-diffusers-gguf • 12B • Updated Jun 10 • 37 • 1
sayakpaul/mini-t2v-verse-with-t5-embeddings • Updated Mar 26
sayakpaul/trained-lumina2-lora-yarn • Text-to-Image • Updated Feb 20 • 8 • 3
sayakpaul/vjepa-ckpts • Updated Feb 17
sayakpaul/FLUX.1-dev-edit-v0 • Text-to-Image • Updated Jan 21 • 41 • 45
sayakpaul/cartoon-control-lr_1e-4-wd_1e-4-gs_10.0-cd_0.1 • Text-to-Image • Updated Jan 5 • 8 • 6
sayakpaul/q8-ltx-video • Updated Jan 2 • 26 • 7
sayakpaul/yarn_art_lora_sana • Text-to-Image • Updated Dec 16, 2024 • 13 • 1
sayakpaul/bnb-single-file-checkpoint-from-civitai • Updated Dec 4, 2024 • 14
sayakpaul/mochi-lora-dissolve • Text-to-Video • Updated Nov 29, 2024 • 4 • 2
sayakpaul/mochi-lora • Text-to-Video • Updated Nov 29, 2024 • 8 • 3
sayakpaul/flux.1-dev-int8-aot-compiled • Updated Oct 31, 2024 • 3
sayakpaul/sd35-large-nf4 • Text-to-Image • Updated Oct 27, 2024 • 5
sayakpaul/yarn_art_lora_flux_nf4 • Text-to-Image • Updated Oct 21, 2024 • 5
sayakpaul/FLUX.1-merged • Text-to-Image • Updated Oct 8, 2024 • 491 • 205
sayakpaul/flux-lora-resizing • Updated Sep 28, 2024 • 67 • 21
sayakpaul/flux.1-schell-int8wo-improved • Updated Sep 5, 2024 • 199 • 7
sayakpaul/flux.1-dev-nf4-with-bnb-integration • Updated Sep 1, 2024 • 62 • 18
sayakpaul/fp8-dog-lora-flux • Text-to-Image • Updated Aug 22, 2024 • 16
sayakpaul/flux.1-dev-nf4 • Updated Aug 14, 2024 • 8.63k • 26
sayakpaul/dpo-sdxl-text2image-v1-full • Text-to-Image • Updated May 29, 2024 • 4 • 1
sayakpaul/actual_bigger_transformer • Updated May 29, 2024 • 6
sayakpaul/gemma-2b-sft-qlora-no-robots • Updated Mar 20, 2024 • 3 • 1
sayakpaul/mgie • Updated Feb 19, 2024 • 14 • 8
sayakpaul/pixel_peft_model-new • Updated Feb 16, 2024
sayakpaul/toy_peft_model-new • Updated Feb 16, 2024
sayakpaul/tiny-sd-pipeline-for-single-file-testing • Text-to-Image • Updated Feb 13, 2024 • 5
sayakpaul/sdxl-base-unet-1.0 • Updated Jan 11, 2024 • 1
sayakpaul/instruct-pix2pix-sdxl-emu • Updated Dec 11, 2023