Commit 8901aae (verified) · 0 Parent(s)
Super-squash branch 'main' using huggingface_hub
- .gitattributes +35 -0
- README.md +29 -0
- coverimg-1.webp +0 -0
- coverimg-2.webp +0 -0
- model.safetensors +3 -0
.gitattributes
ADDED
@@ -0,0 +1,35 @@
+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
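The `.gitattributes` rules above route matching files through Git LFS. As a rough illustration of which file names they catch, here is a minimal sketch that approximates the glob matching with Python's `fnmatch`. Real gitattributes matching has extra rules (for example for patterns containing `/`, such as `saved_model/**/*`), so this is only an approximation for bare file names, and the helper name `tracked_by_lfs` is hypothetical.

```python
from fnmatch import fnmatchcase

# A subset of the patterns added to .gitattributes in this commit.
LFS_PATTERNS = ["*.bin", "*.ckpt", "*.safetensors", "*.tar.*", "*tfevents*"]

def tracked_by_lfs(filename, patterns=LFS_PATTERNS):
    """Approximate check: does a bare file name match any LFS glob pattern?"""
    return any(fnmatchcase(filename, pattern) for pattern in patterns)

# model.safetensors matches "*.safetensors"; README.md matches nothing.
for name in ["model.safetensors", "README.md"]:
    print(name, tracked_by_lfs(name))
```

This is consistent with `model.safetensors` appearing in the commit as a three-line LFS pointer rather than as raw weights.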
README.md
ADDED
@@ -0,0 +1,29 @@
+---
+license: mit
+datasets:
+- nyanko7/danbooru2023
+language:
+- en
+---
+
+
+
+# nyaflow-xl [alpha]
+
+This is an experiment in fine-tuning the Stable Diffusion XL model with the Flow Matching training objective.
+
+Trained in July 2024, starting from [sdxl](https://huggingface.co/stabilityai/stable-diffusion-xl-base-0.9) and using several publicly available datasets. [[demo]](https://huggingface.co/spaces/nyanko7/toaru-xl-model)
+
+## Model Details
+
+Flow Matching generates a sample from the target data distribution by iteratively changing a sample from a prior distribution, e.g., Gaussian. The model is trained to predict the velocity $V_t = \frac{dX_t}{dt}$, which guides it to “move” the sample $X_t$ in the direction of the sample $X_1$. As in prior work (Esser et al., 2024), we sample $t$ from a logit-normal distribution whose underlying Gaussian distribution has zero mean and unit standard deviation, and we use the optimal transport path to construct $X_t$.
+
+Our training dataset consists of 3.6M recaptioned/tagged image-text pairs, with filtering and processing for improved context and stability. Training was completed on a 32×H100 GPU cluster using the DeepSpeed framework. (Thanks for the compute grant!)
+
+We observed consistent improvements in both validation loss and evaluation performance with more training steps and compute. However, due to a limited training budget, we had to cap the training duration at ~48 hours. Despite this constraint, the baseline sdxl model adapted well to the Flow Matching target.
+
+
+
+The model supports concepts, styles, and detailed character rendering. It maintains semantic alignment for diverse prompts and complex inputs, but it does not perform well with natural-language inputs because few NL captions were included in this training run. Furthermore, we found that the model may produce oversaturated images and overfit to some styles.
+
+While nyaflow-xl demonstrates interesting results, it remains a prototype. Feel free to leave comments and criticism.
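The Flow Matching setup described under Model Details in the README can be sketched in a few lines. This is a minimal illustration and not the training code used for nyaflow-xl. It assumes the straight-line optimal transport path $X_t = (1 - t) X_0 + t X_1$ with $X_0$ drawn from a standard Gaussian, and a mean-squared-error loss on the velocity. The README states neither the exact path parameterization nor the loss, and all function names here are hypothetical.

```python
import math
import random

def sample_timestep(rng):
    """Logit-normal t: sigmoid of a standard normal draw (mean 0, std 1)."""
    u = rng.gauss(0.0, 1.0)
    return 1.0 / (1.0 + math.exp(-u))

def make_training_pair(x1, rng):
    """Build (t, X_t, velocity target) for one data sample x1 (list of floats)."""
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]  # prior sample X_0 ~ N(0, I)
    t = sample_timestep(rng)
    # Assumed straight-line path: X_t = (1 - t) * X_0 + t * X_1
    xt = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    # Along this path, dX_t/dt = X_1 - X_0 for every t
    v = [b - a for a, b in zip(x0, x1)]
    return t, xt, v

def flow_matching_loss(v_pred, v_target):
    """Mean squared error between predicted and target velocity (assumed loss)."""
    return sum((p - q) ** 2 for p, q in zip(v_pred, v_target)) / len(v_target)

rng = random.Random(0)
x1 = [0.5, -1.0, 2.0]  # stand-in for an image latent
t, xt, v = make_training_pair(x1, rng)
# A dummy "model" that always predicts zero velocity, just to exercise the loss.
print(t, flow_matching_loss([0.0] * len(v), v))
```

In a real training loop the zero prediction would be replaced by the network's output for the input `(xt, t, caption)`. At sampling time one would integrate the predicted velocity from t = 0 to t = 1, starting from Gaussian noise.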
coverimg-1.webp
ADDED
|
coverimg-2.webp
ADDED
|
model.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d35930d94bd20e56726f3b8c964d73c7c7dae825d6cd1b0d5b79bc0c1d43c2c7
+size 7105349788
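The three lines added for `model.safetensors` are a Git LFS pointer, not the weights themselves: a spec version, the SHA-256 object ID of the real file, and its size in bytes. The following minimal sketch reads those fields, assuming the simple `key value` line layout shown in this diff. The function name `parse_lfs_pointer` is hypothetical.

```python
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d35930d94bd20e56726f3b8c964d73c7c7dae825d6cd1b0d5b79bc0c1d43c2c7
size 7105349788
"""

def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into named fields."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["hash_algo"], len(pointer["digest"]))
print(f"{pointer['size_bytes'] / 1e9:.2f} GB")
```

The size field puts the checkpoint at roughly 7.1 GB, and a downloaded file can be checked by comparing its SHA-256 hash against the digest.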