- AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant
Paper • 2410.18603 • Published • 32
- A Survey of Small Language Models
Paper • 2410.20011 • Published • 40
- Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines
Paper • 2410.21220 • Published • 10
- o1-Coder: an o1 Replication for Coding
Paper • 2412.00154 • Published • 43
Vilmos Bilicki
bilickiv
AI & ML interests
None yet
Organizations
None yet
Collections
4
- Grounded-VideoLLM: Sharpening Fine-grained Temporal Grounding in Video Large Language Models
Paper • 2410.03290 • Published • 7
- TAPTRv3: Spatial and Temporal Context Foster Robust Tracking of Any Point in Long Video
Paper • 2411.18671 • Published • 20
- VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation
Paper • 2412.00927 • Published • 26
spaces
1
models
None public yet
datasets
None public yet