- LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing
  Paper • 2311.00571 • Published • 40
- LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents
  Paper • 2311.05437 • Published • 47
- Ziya-VL: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning
  Paper • 2310.08166 • Published • 1
- Reformulating Vision-Language Foundation Models and Datasets Towards Universal Multimodal Assistants
  Paper • 2310.00653 • Published • 3
Joy Rimchala
joytafty
AI & ML interests
NER
Organizations
Collections: 3
Models: 3
Datasets: 6
joytafty/icdar2023vqabd-small-tables-val • 19 • 42
joytafty/icdar2023vqabd-small-tables-train • 244 • 34
joytafty/denoising-dirty-documents-test • 72 • 42
joytafty/denoising-dirty-documents-train • 144 • 44
joytafty/denoising-dirty-documents-trained_cleaned • 144 • 36
joytafty/denoising-dirty-documents-cleaned • 144 • 35