# DALL·E Mini

_Generate images from a text prompt_

Our logo was generated with DALL·E mini using the prompt "logo of an armchair in the shape of an avocado".

## How to use it?

There are several ways to use DALL·E mini to create your own images:

* use [the official DALL·E Mini demo](https://huggingface.co/spaces/dalle-mini/dalle-mini)
* experiment with the pipeline step by step through our [`inference pipeline notebook`](tools/inference/inference_pipeline.ipynb)

  [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/borisdayma/dalle-mini/blob/main/tools/inference/inference_pipeline.ipynb)

* spin off your own app with the [DALL-E Playground repository](https://github.com/saharmor/dalle-playground) (thanks [Sahar](https://twitter.com/theaievangelist))
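
If you would rather drive the pipeline from your own code, the sketch below mirrors the broad flow of the inference notebook: tokenize the prompt, sample image tokens with the sequence-to-sequence model, then decode those tokens into pixels with the VQGAN. This is a rough sketch only; `DalleBart`, `DalleBartProcessor`, and the decoding call are assumed to match the notebook's API, which may change between releases, so treat the notebook as the reference.

```python
# Hedged sketch of programmatic generation; the inference notebook is the
# authoritative version, and the exact API may differ between releases.
import jax
from dalle_mini import DalleBart, DalleBartProcessor  # assumed public API
from vqgan_jax.modeling_flax_vqgan import VQModel

model = DalleBart.from_pretrained("flax-community/dalle-mini")
processor = DalleBartProcessor.from_pretrained("flax-community/dalle-mini")
vqgan = VQModel.from_pretrained("dalle-mini/vqgan_imagenet_f16_16384")

# Tokenize the prompt, sample one sequence of image tokens, decode to pixels.
prompt = processor(["logo of an armchair in the shape of an avocado"])
encoded = model.generate(**prompt, prng_key=jax.random.PRNGKey(0), params=model.params)
images = vqgan.decode_code(encoded.sequences[..., 1:], params=vqgan.params)  # drop BOS
```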

## How does it work?

Refer to [our report](https://wandb.ai/dalle-mini/dalle-mini/reports/DALL-E-mini--Vmlldzo4NjIxODA).

## Contributing

Join the community on the [LAION Discord](https://discord.gg/xBPBXfcFHd).
Any contribution is welcome, from reporting issues to proposing fixes/improvements or testing the model with cool prompts!
## Development

For inference only, use `pip install git+https://github.com/borisdayma/dalle-mini.git`.
For development, clone the repo and use `pip install -e ".[dev]"`.
Before making a PR, check style with `make style`.

### Image Encoder

We use a VQGAN from [taming-transformers](https://github.com/CompVis/taming-transformers), which can also be fine-tuned.

Use [patil-suraj/vqgan-jax](https://github.com/patil-suraj/vqgan-jax) if you want to convert a checkpoint to JAX (does not support Gumbel).

Any image encoder that turns an image into a fixed sequence of tokens can be used.
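
As a concrete illustration of "image → fixed sequence of tokens", here is a minimal round trip through the VQGAN. It assumes the `vqgan-jax` API (`encode` returning quantized states plus token indices, `decode_code` for the reverse direction); exact signatures may vary between versions.

```python
# Minimal VQGAN round trip: image -> discrete token grid -> reconstructed image.
# Assumes the vqgan-jax API; signatures may differ between versions.
import jax.numpy as jnp
from vqgan_jax.modeling_flax_vqgan import VQModel

vqgan = VQModel.from_pretrained("dalle-mini/vqgan_imagenet_f16_16384")

pixels = jnp.zeros((1, 256, 256, 3))          # one normalized 256x256 RGB image
quant_states, indices = vqgan.encode(pixels)  # image -> fixed sequence of tokens
recon = vqgan.decode_code(indices, params=vqgan.params)  # tokens -> image
print(indices.shape)  # one token per 16x16 patch -> 256 tokens for a 256px image
```
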
### Training of DALL·E mini
Use [`tools/train/train.py`](tools/train/train.py).

You can also adjust the [sweep configuration file](https://docs.wandb.ai/guides/sweeps).

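The repo drives sweeps from a YAML configuration file, but an equivalent sweep can be sketched with the plain `wandb` Python API. The parameter names below are illustrative only, not the ones actually used in training.

```python
# Illustrative hyperparameter sweep via the wandb Python API; the project
# itself uses a YAML sweep file, and these parameter names are made up.
import wandb

sweep_config = {
    "method": "random",
    "metric": {"name": "train/loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"min": 1e-5, "max": 1e-3},
        "gradient_accumulation_steps": {"values": [1, 2, 4]},
    },
}
sweep_id = wandb.sweep(sweep_config, project="dalle-mini")
# Each agent then pulls a configuration and runs the training entry point:
# wandb.agent(sweep_id, function=train_entry_point)
```
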
Trained models are on 🤗 Model Hub:

* [VQGAN-f16-16384](https://huggingface.co/dalle-mini/vqgan_imagenet_f16_16384) for encoding/decoding images
* [DALL·E mini](https://huggingface.co/flax-community/dalle-mini) for generating images from a text prompt

### Where does the logo come from?

The "armchair in the shape of an avocado" was used by OpenAI when releasing DALL·E.

## Acknowledgements

* 🤗 Hugging Face for organizing [the FLAX/JAX community week](https://github.com/huggingface/transformers/tree/master/examples/research_projects/jax-projects)
* Google [TPU Research Cloud (TRC) program](https://sites.research.google/trc/) for providing computing resources
* [Weights & Biases](https://wandb.com/) for providing the infrastructure for experiment tracking and model management

## Authors & Contributors
DALL·E mini was initially developed by:

* [Boris Dayma](https://github.com/borisdayma)
* [Suraj Patil](https://github.com/patil-suraj)
* [Pedro Cuenca](https://github.com/pcuenca)
* [Khalid Saifullah](https://github.com/khalidsaifullaah)
* [Tanishq Abraham](https://github.com/tmabraham)
* [Phúc Lê Khắc](https://github.com/lkhphuc)
* [Luke Melas](https://github.com/lukemelas)
* [Ritobrata Ghosh](https://github.com/ghosh-r)

Many thanks to the people who helped make it better:

* the [DALLE-Pytorch](https://discord.gg/xBPBXfcFHd) and [EleutherAI](https://www.eleuther.ai/) communities for testing and exchanging cool ideas
* [Rohan Anil](https://github.com/rohan-anil) for adding the Distributed Shampoo optimizer
* [Phil Wang](https://github.com/lucidrains) for providing many cool implementations of transformer variants and sharing interesting insights through [x-transformers](https://github.com/lucidrains/x-transformers)
* [Katherine Crowson](https://github.com/crowsonkb) for [super conditioning](https://twitter.com/RiversHaveWings/status/1478093658716966912)

## Citing DALL·E mini

Image encoder from "[Taming Transformers for High-Resolution Image Synthesis](https://arxiv.org/abs/2012.09841)".

Sequence to sequence model based on "[BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension](https://arxiv.org/abs/1910.13461v1)" with implementation of a few variants:

* "[GLU Variants Improve Transformer](https://arxiv.org/abs/2002.05202)"
* "[DeepNet: Scaling Transformers to 1,000 Layers](https://arxiv.org/abs/2203.00555)"
* "[NormFormer: Improved Transformer Pretraining with Extra Normalization](https://arxiv.org/abs/2110.09456)"
* "[Swin Transformer: Hierarchical Vision Transformer using Shifted Windows](https://arxiv.org/abs/2103.14030)"
* "[CogView: Mastering Text-to-Image Generation via Transformers](https://arxiv.org/abs/2105.13290v2)"
* "[Root Mean Square Layer Normalization](https://arxiv.org/abs/1910.07467)"
* "[Sinkformers: Transformers with Doubly Stochastic Attention](https://arxiv.org/abs/2110.11773)"

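For a flavor of what these variants look like in code, here is a minimal Flax sketch of the GEGLU feed-forward block from "GLU Variants Improve Transformer". It illustrates the idea only and is not the exact module used in this repo.

```python
# Minimal GEGLU feed-forward block ("GLU Variants Improve Transformer").
# Illustrative only; not the exact implementation used in dalle-mini.
import flax.linen as nn

class GLUFeedForward(nn.Module):
    dim: int     # model width
    hidden: int  # inner feed-forward width

    @nn.compact
    def __call__(self, x):
        gate = nn.gelu(nn.Dense(self.hidden)(x))  # GELU-activated gate branch
        value = nn.Dense(self.hidden)(x)          # linear value branch
        return nn.Dense(self.dim)(gate * value)   # elementwise gate, project back
```
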
Main optimizer (Distributed Shampoo) from "[Scalable Second Order Optimization for Deep Learning](https://arxiv.org/abs/2002.09018)".