From Pixels to Words -- Towards Native Vision-Language Primitives at Scale
Haiwen Diao
Paranioar
AI & ML interests
Vision-and-Language, Parameter-efficient Transfer Learning, Multi-modal Large Language Model
Recent Activity
upvoted a paper about 2 hours ago: Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond
updated a collection 5 days ago: SenseNova-U1
updated a collection 5 days ago: SenseNova-U1