Collections
Collections including paper arxiv:2403.17008
- U-Net: Convolutional Networks for Biomedical Image Segmentation
  Paper • 1505.04597 • Published • 8
- Image Segmentation using U-Net Architecture for Powder X-ray Diffraction Images
  Paper • 2310.16186 • Published • 2
- H-DenseUNet: Hybrid Densely Connected UNet for Liver and Tumor Segmentation from CT Volumes
  Paper • 1709.07330 • Published • 2
- Deep LOGISMOS: Deep Learning Graph-based 3D Segmentation of Pancreatic Tumors on CT scans
  Paper • 1801.08599 • Published • 2

- FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation
  Paper • 2403.06775 • Published • 3
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
  Paper • 2010.11929 • Published • 6
- Data Incubation -- Synthesizing Missing Data for Handwriting Recognition
  Paper • 2110.07040 • Published • 2
- A Mixture of Expert Approach for Low-Cost Customization of Deep Neural Networks
  Paper • 1811.00056 • Published • 2

- Pix2Gif: Motion-Guided Diffusion for GIF Generation
  Paper • 2403.04634 • Published • 14
- StableDrag: Stable Dragging for Point-based Image Editing
  Paper • 2403.04437 • Published • 25
- Teaching Large Language Models to Reason with Reinforcement Learning
  Paper • 2403.04642 • Published • 46
- Yi: Open Foundation Models by 01.AI
  Paper • 2403.04652 • Published • 62

- Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
  Paper • 2402.17177 • Published • 88
- EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
  Paper • 2402.17485 • Published • 189
- VisionLLaMA: A Unified LLaMA Interface for Vision Tasks
  Paper • 2403.00522 • Published • 44
- PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
  Paper • 2403.04692 • Published • 40

- EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
  Paper • 2402.17485 • Published • 189
- VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior
  Paper • 2312.01841 • Published • 1
- MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model
  Paper • 2311.16498 • Published • 1
- GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians
  Paper • 2312.02134 • Published • 2

- Training-Free Consistent Text-to-Image Generation
  Paper • 2402.03286 • Published • 65
- ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation
  Paper • 2402.04324 • Published • 23
- λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space
  Paper • 2402.05195 • Published • 18
- FiT: Flexible Vision Transformer for Diffusion Model
  Paper • 2402.12376 • Published • 48

- λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space
  Paper • 2402.05195 • Published • 18
- DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization
  Paper • 2402.09812 • Published • 12
- The Chosen One: Consistent Characters in Text-to-Image Diffusion Models
  Paper • 2311.10093 • Published • 57
- IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models
  Paper • 2403.13535 • Published • 22

- Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis
  Paper • 2401.09048 • Published • 9
- Improving fine-grained understanding in image-text pre-training
  Paper • 2401.09865 • Published • 16
- Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
  Paper • 2401.10891 • Published • 59
- Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
  Paper • 2401.13627 • Published • 73

- One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
  Paper • 2306.07967 • Published • 24
- Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
  Paper • 2306.07954 • Published • 113
- TryOnDiffusion: A Tale of Two UNets
  Paper • 2306.08276 • Published • 73
- Seeing the World through Your Eyes
  Paper • 2306.09348 • Published • 33