Trials, Errors, and Breakthroughs: Our Rocky Road to OVD SOTA with Reinforcement Learning
Multimodal AI, Agents
Om AI Lab is a passionate group building multimodal AI agents that reshape how we work and live.
Open Agent Leaderboard
Find and highlight objects in images based on text descriptions
Process and answer questions about webpage videos
VLM-R1 model for Open-Vocabulary Object Detection