Tony Wu

tonywu71

https://tonywu71.notion.site/Hi-I-m-Tony-e937d2baf5ab4669904b04fd24513499?pvs=74

AI & ML interests

RAG, LLMs, ASR

tonywu71's activity

updated a model about 7 hours ago

vidore/colqwen2-v1.0-hf-internal

Updated about 7 hours ago • 9

upvoted a paper about 19 hours ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published 1 day ago • 172

upvoted a paper 8 days ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published 8 days ago • 158

liked a Space 9 days ago

720

Open ASR Leaderboard

🏆

Request evaluation for new speech models

published a model about 1 month ago

vidore/colqwen2-v1.0-hf

Visual Document Retrieval • Updated Mar 9 • 15

updated 7 models about 1 month ago

liked a model about 1 month ago

Qwen/Qwen2.5-7B-Instruct

Text Generation • Updated Jan 12 • 2.72M • 635

liked a Space about 1 month ago

4.29k

Chatbot Arena Leaderboard

🏆

Display chatbot leaderboard and statistics

liked a Space about 2 months ago

5.4k

MTEB Leaderboard

🥇

Embedding Leaderboard

upvoted a paper about 2 months ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20 • 142

upvoted an article about 2 months ago

SigLIP 2: A better multilingual vision language encoder

Article • Feb 21 • 149

updated a model about 2 months ago

vidore/colqwen2.5-v0.2

Visual Document Retrieval • Updated 25 days ago • 11.2k • 16