lvfengchun

AI & ML interests

None yet

Recent Activity

reacted to MonsterMMORPG's post with 👀 about 2 months ago
updated a model 10 months ago
lvfengchun/chatglm_lora

Organizations

Mars Republic

lvfengchun's activity

reacted to MonsterMMORPG's post with 👀 about 2 months ago
NVIDIA Labs has published the SANA model weights and a Gradio demo app. Check out this amazing new text-to-image model by NVIDIA.

Official repo: https://github.com/NVlabs/Sana

1-click installers for Windows, RunPod, and Massed Compute, plus a free Kaggle notebook: https://www.patreon.com/posts/116474081

You can follow the instructions in the repository to install and use it locally. I tested it on my Windows RTX 3060 and RTX 3090 GPUs.

I have also measured speeds and VRAM usage.

It uses 9.5 GB of VRAM, but someone reported that it works well on 8 GB GPUs too.

Per-image speeds at default settings:

Free Kaggle account notebook on a T4 GPU: 15 seconds
RTX 3060 (12 GB): 9.5 seconds
RTX 3090: 4 seconds
RTX 4090: 2 seconds

More info: https://nvlabs.github.io/Sana/

It also works great in the cloud on RunPod and Massed Compute.

Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

About Sana (taken from the official repo)

We introduce Sana, a text-to-image framework that can efficiently generate images up to 4096 × 4096 resolution. Sana can synthesize high-resolution, high-quality images with strong text-image alignment at a remarkably fast speed, deployable on laptop GPU. Core designs include:

Deep compression autoencoder: unlike traditional AEs, which compress images only 8×, we trained an AE that can compress images 32×, effectively reducing the number of latent tokens.
Linear DiT: we replace all vanilla attention in DiT with linear attention, which is more efficient at high resolutions without sacrificing quality.
Decoder-only text encoder: we replaced T5 with modern decoder-only small LLM as the text encoder and designed complex human instruction with in-context learning to enhance the image-text alignment.
Efficient training and sampling: we propose Flow-DPM-Solver to reduce sampling steps, with efficient caption labeling and selection to accelerate convergence.
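
To make the two efficiency claims above a bit more concrete, here is a rough sketch in plain Python. It is not Sana's actual code. The function names are hypothetical, the token count ignores any patching step the real model may apply, and the ReLU feature map is an assumed kernel choice used only for illustration. The first function counts latent positions for a given compression factor; the other two show that linear attention can reach the same result as the quadratic form without ever building the full n × n score matrix.

```python
def latent_tokens(height, width, compression):
    """Latent positions left after downsampling each side by `compression`.
    Assumption: no extra patching step is applied afterwards."""
    return (height // compression) * (width // compression)


def _relu(rows):
    """Element-wise ReLU on a list-of-lists matrix."""
    return [[max(x, 0.0) for x in row] for row in rows]


def linear_attention(q, k, v, eps=1e-6):
    """Linear attention with an assumed ReLU feature map.

    q, k: n x d matrices; v: n x d_v matrix (lists of lists).
    K^T V and the key sum are built once and reused by every query,
    so the cost grows linearly with the number of tokens n.
    """
    q, k = _relu(q), _relu(k)
    n, d, d_v = len(k), len(k[0]), len(v[0])
    kv = [[sum(k[j][a] * v[j][b] for j in range(n)) for b in range(d_v)]
          for a in range(d)]                      # d x d_v summary
    k_sum = [sum(row[a] for row in k) for a in range(d)]
    out = []
    for qi in q:
        denom = sum(qi[a] * k_sum[a] for a in range(d)) + eps
        out.append([sum(qi[a] * kv[a][b] for a in range(d)) / denom
                    for b in range(d_v)])
    return out


def quadratic_attention(q, k, v, eps=1e-6):
    """Same kernel, computed the quadratic way: one score per (query, key) pair."""
    q, k = _relu(q), _relu(k)
    out = []
    for qi in q:
        scores = [sum(x * y for x, y in zip(qi, kj)) for kj in k]
        denom = sum(scores) + eps
        out.append([sum(s * vj[b] for s, vj in zip(scores, v)) / denom
                    for b in range(len(v[0]))])
    return out


if __name__ == "__main__":
    print(latent_tokens(4096, 4096, 8))    # 262144
    print(latent_tokens(4096, 4096, 32))   # 16384
```

Under these assumptions, a 4096 × 4096 image goes from 262,144 latent positions at 8× compression to 16,384 at 32×, a 16× reduction, and the linear form then scales with that smaller count rather than with its square.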