nyanko7 committed on
Commit
8901aae
·
verified ·
0 Parent(s):

Super-squash branch 'main' using huggingface_hub

Files changed (5)
  1. .gitattributes +35 -0
  2. README.md +29 -0
  3. coverimg-1.webp +0 -0
  4. coverimg-2.webp +0 -0
  5. model.safetensors +3 -0
.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,29 @@
+ ---
+ license: mit
+ datasets:
+ - nyanko7/danbooru2023
+ language:
+ - en
+ ---
+
+ ![](./coverimg-1.webp)
+
+ # nyaflow-xl [alpha]
+
+ This is an experiment in fine-tuning the Stable Diffusion XL model with the Flow Matching training objective.
+
+ Training was done in July 2024, starting from [sdxl](https://huggingface.co/stabilityai/stable-diffusion-xl-base-0.9) and using several publicly available datasets. [[demo]](https://huggingface.co/spaces/nyanko7/toaru-xl-model)
+
+ ## Model Details
+
+ Flow Matching generates a sample from the target data distribution by iteratively transforming a sample from a prior distribution, e.g., a Gaussian. The model is trained to predict the velocity $V_t = \frac{dX_t}{dt}$, which guides it to "move" the sample $X_t$ in the direction of the sample $X_1$. As in prior work (Esser et al., 2024), we sample $t$ from a logit-normal distribution whose underlying Gaussian has zero mean and unit standard deviation, and we use the optimal transport path to construct $X_t$.
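
As a rough illustration of the objective described above, here is a minimal sketch in plain Python. It is not the training code used for this model. It assumes the convention in the text ($X_0$ drawn from the Gaussian prior, $X_1$ drawn from the data), the linear optimal transport path $X_t = (1 - t) X_0 + t X_1$, and therefore the target velocity $X_1 - X_0$. All function names (`sample_logit_normal`, `make_training_example`, `velocity_mse`, `euler_sample`) are hypothetical, and samples are flat lists of floats rather than image latents.

```python
import math
import random


def sample_logit_normal(rng, mean=0.0, std=1.0):
    """Draw t in (0, 1) as sigmoid(z) with z ~ N(mean, std^2)."""
    z = rng.gauss(mean, std)
    return 1.0 / (1.0 + math.exp(-z))


def make_training_example(rng, x1):
    """Return (x_t, t, v) for one data sample x1 (assumed convention)."""
    x0 = [rng.gauss(0.0, 1.0) for _ in x1]          # sample from the prior
    t = sample_logit_normal(rng)                     # logit-normal timestep
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]  # linear path
    v = [b - a for a, b in zip(x0, x1)]              # target velocity dX_t/dt
    return x_t, t, v


def velocity_mse(pred, target):
    """Mean squared error between predicted and target velocity."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)


def euler_sample(velocity_fn, x0, steps=10):
    """Integrate dX/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for i in range(steps):
        v = velocity_fn(x, i * dt)
        x = [a + dt * b for a, b in zip(x, v)]
    return x


rng = random.Random(0)
x1 = [0.5, -1.0, 2.0, 0.0]                           # stand-in for a data sample
x_t, t, v = make_training_example(rng, x1)
loss = velocity_mse([0.0] * len(v), v)               # placeholder all-zero prediction
```

In actual training, the prediction passed to `velocity_mse` would come from the network evaluated at $X_t$, $t$, and the text conditioning; the all-zero prediction here only keeps the example self-contained. `euler_sample` shows the iterative generation step in its simplest form, under the assumption of a plain fixed-step Euler solver.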
+
+ Our training dataset consists of 3.6M recaptioned/tagged image-text pairs, with filtering and processing for improved context and stability. Training was completed on a 32×H100 GPU cluster using the DeepSpeed framework. (Thanks for the compute grant!)
+
+ We observe consistent improvements in both validation loss and evaluation performance with increased training steps and compute. Due to a limited training budget, however, we had to cap the training duration at ~48 hours. Despite these constraints, the baseline sdxl model adapted well to the Flow Matching target.
+
+ ![](./coverimg-2.webp)
+
+ The model supports concepts, styles, and detailed character rendering. It maintains semantic alignment for diverse prompts and complex inputs, but it performs poorly on natural language inputs because few NL captions were included in this training run. Furthermore, we found that the model may produce oversaturated images and overfit to some styles.
+
+ While nyaflow-xl demonstrates interesting results, it remains a prototype. Feel free to leave comments and criticism.
coverimg-1.webp ADDED
coverimg-2.webp ADDED
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d35930d94bd20e56726f3b8c964d73c7c7dae825d6cd1b0d5b79bc0c1d43c2c7
+ size 7105349788
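
The `model.safetensors` entry above is a Git LFS pointer: it records the SHA-256 digest and the size in bytes of the real file rather than its contents. A downloaded copy could be checked against those two fields with a sketch like the following. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, not part of any library.

```python
import hashlib

POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:d35930d94bd20e56726f3b8c964d73c7c7dae825d6cd1b0d5b79bc0c1d43c2c7
size 7105349788
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so a multi-gigabyte file is never held in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER)
```

Calling `matches_pointer("model.safetensors", pointer)` on a local download would then report whether the file matches the commit; the local path is an assumption about where the file was saved.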