SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 1 day ago • 84
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention Paper • 2502.11089 • Published 6 days ago • 129
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 17 days ago • 188
Reward-Guided Speculative Decoding for Efficient LLM Reasoning Paper • 2501.19324 • Published 22 days ago • 37
Optimizing Large Language Model Training Using FP4 Quantization Paper • 2501.17116 • Published 25 days ago • 35
Temporal Preference Optimization Collection • Temporal Preference Optimization for Long-form Video Understanding • 3 items • Updated Jan 19 • 4
Temporal Preference Optimization for Long-Form Video Understanding Paper • 2501.13919 • Published 30 days ago • 22
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published about 1 month ago • 327
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published about 1 month ago • 83
VideoWorld: Exploring Knowledge Learning from Unlabeled Videos Paper • 2501.09781 • Published Jan 16 • 25