Inst-IT

university

https://inst-it.github.io/

inst-it

AI & ML interests

Large Multimodal Models

Inst-IT's activity

wjpoom

authored a paper about 1 month ago

CoMP: Continual Multimodal Pre-training for Vision Foundation Models

Paper • 2503.18931 • Published Mar 24 • 30

wjpoom

updated 2 datasets about 2 months ago

Inst-IT/Inst-It-Bench

Viewer • Updated Mar 3 • 4.07k • 83 • 1

Inst-IT/Inst-It-Dataset

Viewer • Updated Mar 1 • 72.5k • 352 • 7

wjpoom

updated a Space 2 months ago

README

🐨

Boosting Multimodal Understanding at Instance-Level

wjpoom

published a Space 2 months ago

README

🐨

Boosting Multimodal Understanding at Instance-Level

wjpoom

updated a collection 2 months ago

Inst-IT Models

Collection

A series of LMMs finetuned with the Inst-IT Dataset, skilled in fine-grained image/video understanding at the instance-level. • 2 items • Updated Mar 17

wjpoom

updated 2 models 2 months ago

Inst-IT/LLaVA-Next-Inst-It-Qwen2-7B

Video-Text-to-Text • Updated Feb 21 • 9 • 3

Inst-IT/LLaVA-Next-Inst-It-Vicuna-7B

Video-Text-to-Text • Updated Feb 20 • 11 • 2

menglc

authored a paper 4 months ago

Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning

Paper • 2412.03565 • Published Dec 4, 2024 • 11

wjpoom

authored 2 papers 5 months ago

Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning

Paper • 2412.03565 • Published Dec 4, 2024 • 11

Synthesize, Diagnose, and Optimize: Towards Fine-Grained Vision-Language Understanding

Paper • 2312.00081 • Published Nov 30, 2023 • 2

menglc

authored 2 papers 10 months ago

SEGIC: Unleashing the Emergent Correspondence for In-Context Segmentation

Paper • 2311.14671 • Published Nov 24, 2023

DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs

Paper • 2406.04334 • Published Jun 6, 2024

menglc

authored a paper over 1 year ago

To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning

Paper • 2311.07574 • Published Nov 13, 2023 • 16

Team members 2
