Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models
Abstract
We introduce Edify Image, a family of diffusion models capable of generating photorealistic image content with pixel-perfect accuracy. Edify Image utilizes cascaded pixel-space diffusion models trained using a novel Laplacian diffusion process, in which image signals at different frequency bands are attenuated at varying rates. Edify Image supports a wide range of applications, including text-to-image synthesis, 4K upsampling, ControlNets, 360 HDR panorama generation, and finetuning for image customization.
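To make the idea of attenuating frequency bands at different rates concrete, the following is a minimal NumPy sketch, not the authors' implementation. It splits an image into Laplacian-pyramid bands, scales each band by its own decay factor, and adds Gaussian noise. The function names, the exponential decay form, the `rates` values, the assumption that finer bands decay faster, and the use of `t` as the noise scale are all illustrative assumptions; the abstract only states that the bands are attenuated at varying rates.

```python
import numpy as np

def downsample(x):
    # 2x2 average pooling; assumes even height and width.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x):
    # Nearest-neighbour 2x upsampling.
    return x.repeat(2, axis=0).repeat(2, axis=1)

def decompose(image, levels=3):
    """Split an image into Laplacian-pyramid bands, finest band first."""
    bands = []
    current = image
    for _ in range(levels - 1):
        low = downsample(current)
        bands.append(current - upsample(low))  # high-frequency residual
        current = low
    bands.append(current)  # coarsest low-frequency band
    return bands

def reconstruct(bands):
    """Invert decompose(): upsample the coarse band and add residuals back."""
    current = bands[-1]
    for high in reversed(bands[:-1]):
        current = upsample(current) + high
    return current

def noisy_sample(image, t, rates=(4.0, 2.0, 1.0), rng=None):
    """Hypothetical forward process: band k is scaled by exp(-rates[k] * t).

    The rates are illustrative (finest band decays fastest), and using t as
    the noise standard deviation is an assumption of this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng
    bands = decompose(image, levels=len(rates))
    attenuated = [np.exp(-r * t) * b for r, b in zip(rates, bands)]
    return reconstruct(attenuated) + t * rng.standard_normal(image.shape)
```

With these illustrative rates, fine detail is lost early in the process while coarse structure persists longer. At `t = 0` the sketch returns the original image unchanged.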
Community
Edify Image can generate photorealistic high-resolution images from text prompts. It supports a range of capabilities, including (a) Text-to-image generation, (b) Finetuning, (c) Generation with additional control, and (d) Panorama generation.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- FreCaS: Efficient Higher-Resolution Image Generation via Frequency-aware Cascaded Sampling (2024)
- Multi-Scale Diffusion: Enhancing Spatial Layout in High-Resolution Panoramic Image Generation (2024)
- DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation (2024)
- JVID: Joint Video-Image Diffusion for Visual-Quality and Temporal-Consistency in Video Generation (2024)
- Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis (2024)