From Pixels to Words -- Towards Native One-Vision Models at Scale
Haiwen Diao
Paranioar
AI & ML interests
Vision-and-Language, Parameter-efficient Transfer Learning, Multi-modal Large Language Model
Recent Activity
upvoted a paper about 2 hours ago
Show the Signal, Hide the Noise: Spectral Forcing for Pixel-Space Diffusion
authored a paper 20 days ago
From Pixels to Words -- Towards Native One-Vision Models at Scale
commented on a paper 20 days ago
From Pixels to Words -- Towards Native One-Vision Models at Scale