-
One-Minute Video Generation with Test-Time Training
Paper • 2504.05298 • Published • 106
MoCha: Towards Movie-Grade Talking Character Synthesis
Paper • 2503.23307 • Published • 136
Towards Understanding Camera Motions in Any Video
Paper • 2504.15376 • Published • 157
Antidistillation Sampling
Paper • 2504.13146 • Published • 61
Collections
Collections including paper arxiv:2409.17115
-
Automated Design of Agentic Systems
Paper • 2408.08435 • Published • 41
On the limits of agency in agent-based models
Paper • 2409.10568 • Published • 14
On the Diagram of Thought
Paper • 2409.10038 • Published • 14
DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?
Paper • 2409.07703 • Published • 68
-
Qwen2.5-Coder Technical Report
Paper • 2409.12186 • Published • 149
Attention Heads of Large Language Models: A Survey
Paper • 2409.03752 • Published • 91
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency
Paper • 2409.02634 • Published • 98
OmniGen: Unified Image Generation
Paper • 2409.11340 • Published • 116
-
LinFusion: 1 GPU, 1 Minute, 16K Image
Paper • 2409.02097 • Published • 35
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
Paper • 2409.11406 • Published • 28
Diffusion Models Are Real-Time Game Engines
Paper • 2408.14837 • Published • 126
Segment Anything with Multiple Modalities
Paper • 2408.09085 • Published • 23
-
VILA^2: VILA Augmented VILA
Paper • 2407.17453 • Published • 42
Octopus v4: Graph of language models
Paper • 2404.19296 • Published • 119
Octo-planner: On-device Language Model for Planner-Action Agents
Paper • 2406.18082 • Published • 49
Dolphin: Long Context as a New Modality for Energy-Efficient On-Device Language Models
Paper • 2408.15518 • Published • 43