ChartNet: A Million-Scale, High-Quality Multimodal Dataset for Robust Chart Understanding Paper • 2603.27064 • Published 30 days ago • 27
Article Granite 4.0 3B Vision: Compact Multimodal Intelligence for Enterprise Documents 26 days ago • 34
Granite Vision Collection Multimodal models built for visual document analysis and image understanding. • 7 items • Updated about 24 hours ago • 35
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models Paper • 2410.06154 • Published Oct 8, 2024 • 16
Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts Paper • 2406.12034 • Published Jun 17, 2024 • 16
Trans-LoRA: towards data-free Transferable Parameter Efficient Finetuning Paper • 2405.17258 • Published May 27, 2024 • 16
LangNav: Language as a Perceptual Representation for Navigation Paper • 2310.07889 • Published Oct 11, 2023 • 6