vggt
VGGT (CVPR 2025)
Wan2.1-T2V-14B + Fast 4-step with NAG + Automatic Audio
Generate images from text prompts
Swap faces in images and enhance them if desired
Transcribe audio or YouTube videos into text
Search and submit code models for evaluation
Request evaluation for a speech model
Expressive Zero-Shot TTS
Generate a custom song from lyrics and prompts
Audio-Driven Multi-Person Conversational Video Generation
Upscale and enhance images with Real-ESRGAN
High-fidelity Virtual Try-on
260+ impressive LoRAs for FLUX.1
Upscale an image to higher resolution
Generate audio from video or text prompts
Discussions about the Inference Providers feature on the Hub
Speedy and Accurate Image to 3D Generator
A demo showcasing a medical learning experience with CXR images
Display and download evaluation data for coding tasks
Compare original and improved OCR text from historical documents
Generate personalized images with face preservation
VLMEvalKit Evaluation Results Collection
F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
Extend images by infilling and resizing