COMPACT: COMPositional Atomic-to-Complex Visual Capability Tuning Paper • 2504.21850 • Published 3 days ago • 24
Unifying Specialized Visual Encoders for Video Language Models Paper • 2501.01426 • Published Jan 2 • 21