|
|
--- |
|
|
language: en |
|
|
library_name: optimum.neuron |
|
|
tags: |
|
|
- diffusion |
|
|
- image-generation |
|
|
- aws |
|
|
- neuronx |
|
|
- inf2 |
|
|
- flux |
|
|
- compiled |
|
|
- bfloat16 |
|
|
license: creativeml-openrail-m |
|
|
|
|
pipeline_tag: text-to-image |
|
|
base_model: Freepik/flux.1-lite-8B |
|
|
--- |
|
|
|
|
|
# Flux Lite 8B – 1024×1024 (Tensor Parallelism 4, AWS Inf2) |
|
|
|
|
|
🚀 This repository contains the **compiled NeuronX graph** for running [Freepik's Flux.1-Lite-8B](https://huggingface.co/Freepik/flux.1-lite-8B) model on **AWS Inferentia2 (Inf2)** instances, optimized for **1024×1024 image generation** with **tensor parallelism = 4**. |
|
|
|
|
|
The model has been compiled using [🤗 Optimum Neuron](https://huggingface.co/docs/optimum/neuron/index) to leverage AWS NeuronCores for efficient inference at scale. |
|
|
|
|
|
--- |
|
|
|
|
|
## 🔧 Compilation Details |
|
|
- **Base model:** `Freepik/flux.1-lite-8B` |
|
|
- **Framework:** [optimum-neuron](https://github.com/huggingface/optimum-neuron) |
|
|
- **Tensor Parallelism:** `4` (splits model across 4 NeuronCores) |
|
|
- **Input resolution:** `1024 ร 1024` |
|
|
- **Batch size:** `1` |
|
|
- **Precision:** `bfloat16` |
|
|
- **Auto-casting:** disabled (`auto_cast="none"`) |
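
Tensor parallelism 4 splits the model across four NeuronCores, so the target instance must expose at least four cores. A quick sanity check, assuming the AWS Neuron tools (which provide `neuron-ls`) are installed:

```bash
# Lists the Inferentia2 devices and NeuronCores visible on this instance;
# at least 4 NeuronCores are required for this tensor-parallel-4 graph.
neuron-ls
```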
|
|
|
|
|
--- |
|
|
|
|
|
## 📥 Installation |
|
|
|
|
|
Make sure you are running on an **AWS Inf2 instance** with the [AWS Neuron SDK](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/setup/neuron-intro.html) installed. |
|
|
|
|
|
```bash |
|
|
pip install "optimum[neuronx]" diffusers |
|
|
``` |
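
If the Neuron Python packages themselves are not yet present, they are normally installed from AWS's Neuron pip repository. A minimal sketch; pin package versions to match your installed Neuron SDK release:

```bash
# Point pip at the AWS Neuron package repository, then install the
# NeuronX compiler and the Neuron-enabled PyTorch build.
python -m pip config set global.extra-index-url https://pip.repos.neuron.amazonaws.com
pip install neuronx-cc torch-neuronx
```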
|
|
|
|
|
--- |
|
|
|
|
|
|
|
|
## 🚀 Usage |

Load the compiled pipeline from the Hugging Face Hub:

```python
from optimum.neuron import NeuronFluxPipeline

# The compiled graph already encodes the settings listed above:
# batch_size=1, 1024x1024, bfloat16, tensor parallelism 4 (4 NeuronCores),
# so no export arguments are needed when loading.
pipe = NeuronFluxPipeline.from_pretrained("kutayozbay/flux-lite-8B-1024x1024-tp4")
```
|
|
|
|
|
Generate an image:

```python
prompt = "A futuristic city skyline at sunset"
image = pipe(prompt).images[0]
image.save("flux_output.png")
```
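
The call accepts the usual diffusers-style generation arguments; the values below are illustrative rather than tuned for this checkpoint:

```python
import torch

# Illustrative settings: a fixed seed for reproducibility plus standard
# diffusers Flux call arguments (step count and guidance scale are not tuned here).
generator = torch.Generator().manual_seed(42)
image = pipe(
    "A watercolor painting of a lighthouse at dawn",
    num_inference_steps=28,
    guidance_scale=3.5,
    generator=generator,
).images[0]
image.save("flux_lighthouse.png")
```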
|
|
|
|
|
|
|
|
## 🔁 Re-compilation Example |
|
|
|
|
|
To compile this model yourself: |
|
|
|
|
|
```python
import torch

from optimum.neuron import NeuronFluxPipeline

# Compiler options and static input shapes used for this export
compiler_args = {"auto_cast": "none"}
input_shapes = {"batch_size": 1, "height": 1024, "width": 1024}

# Export (compile) the base checkpoint for NeuronCores with tensor parallelism 4
pipe = NeuronFluxPipeline.from_pretrained(
    "Freepik/flux.1-lite-8B",
    torch_dtype=torch.bfloat16,
    export=True,
    tensor_parallel_size=4,
    **compiler_args,
    **input_shapes,
)

# Save the compiled artifacts so they can be reloaded without re-compiling
pipe.save_pretrained("flux_lite_neuronx_1024_tp4/")
```
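
To publish the compiled graph (as this repository does), the saved directory can be uploaded with `huggingface_hub` after `huggingface-cli login`; the repository id below is a placeholder:

```python
from huggingface_hub import HfApi

# Upload the compiled artifacts; "your-username/flux-lite-8B-1024x1024-tp4" is a placeholder repo id.
HfApi().upload_folder(
    folder_path="flux_lite_neuronx_1024_tp4/",
    repo_id="your-username/flux-lite-8B-1024x1024-tp4",
    repo_type="model",
)
```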
|
|
|
|
|
|
|
|
|
|
|