nvidia/parakeet-tdt-0.6b-v2 Automatic Speech Recognition • Updated about 13 hours ago • 965k • 1.16k
Post 4783 • Researchers developed Sonic AI, enabling precise facial animation from speech cues. It decouples head and expression control via audio tone analysis and time-aware fusion for natural long-form synthesis.
Do generative video models learn physical principles from watching videos? Paper • 2501.09038 • Published Jan 14 • 35
ZePo: Zero-Shot Portrait Stylization with Faster Sampling Paper • 2408.05492 • Published Aug 10, 2024 • 7
OpenResearcher: Unleashing AI for Accelerated Scientific Research Paper • 2408.06941 • Published Aug 13, 2024 • 33
UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalization Paper • 2408.05939 • Published Aug 12, 2024 • 15
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer Paper • 2408.06072 • Published Aug 12, 2024 • 40
ControlNeXt: Powerful and Efficient Control for Image and Video Generation Paper • 2408.06070 • Published Aug 12, 2024 • 54
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Paper • 2408.06292 • Published Aug 12, 2024 • 126