dinhanhx

AI & ML interests

Vision Language

Recent Activity

liked a model 9 days ago

Qwen/Qwen3.5-9B

upvoted a collection 11 days ago

liked a model 12 days ago

opendatalab/MinerU2.5-Pro-2605-1.2B

upvoted a collection 11 days ago

LingBot-Vision

5 items • Updated 17 days ago • 11

upvoted a collection about 2 months ago

Gemma 4 QAT Q4_0

19 items • Updated 2 days ago • 148

upvoted a paper 3 months ago

SAM3-LiteText: An Anatomical Study of the SAM3 Text Encoder for Efficient Vision-Language Segmentation

Paper • 2602.12173 • Published Feb 12 • 3

upvoted an article 3 months ago

Article

Running Native PyTorch on TPUs with Zero Code Changes

rishiraj • Feb 21 • 6

upvoted a paper 4 months ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20, 2025 • 167

upvoted an article 4 months ago

Article

Welcome Gemma 4: Frontier multimodal intelligence on device

merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 919

upvoted 2 collections 4 months ago

jina-embeddings-v5-text

Our 5th-gen embeddings: two lightweight multilingual models with SOTA performance in retrieval, matching, clustering, and classification. • 29 items • Updated Feb 27 • 41

NVIDIA Nemotron v3

Open, Production-ready Enterprise Models • 23 items • Updated 7 days ago • 337

upvoted an article 6 months ago

Article

ViDoRe V3: a comprehensive evaluation of retrieval for enterprise use-cases

QuentinJG • Nov 5, 2025 • 67

upvoted a collection 6 months ago

Contextual AI Reranker v2

Family of instruction-following multilingual rerankers on the cost/performance Pareto frontier across public and customer benchmarks • 9 items • Updated Apr 23 • 12

upvoted a paper 6 months ago

LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding

Paper • 2202.13669 • Published Feb 28, 2022 • 3

upvoted an article 6 months ago

Article

How We Built a Semantic Highlight Model To Save Token Cost for RAG

zilliz • Jan 15 • 67

upvoted a paper 7 months ago

NV-Retriever: Improving text embedding models with effective hard-negative mining

Paper • 2407.15831 • Published Jul 22, 2024 • 5

upvoted a collection 7 months ago

Pixio

5 items • Updated Dec 19, 2025 • 16

upvoted a collection 8 months ago

DRAMA

A collection of small (sub-1B) multilingual dense retrievers that generalize well across a number of tasks and languages. • 3 items • Updated Feb 26, 2025 • 9

upvoted an article 8 months ago

Article

SmolVLM2: Bringing Video Understanding to Every Device

orrzohar, mfarre, andito, merve, pcuenq, cyrilzakka, Xenova • Feb 20, 2025 • 344

upvoted a collection 8 months ago

MetaCLIP2 Multilingual

8 items • Updated Nov 12, 2025 • 16

upvoted 2 collections 9 months ago

Meta CLIP 1

Scaling CLIP data with transparent training distribution from an end-to-end pipeline. • 7 items • Updated Nov 24, 2025 • 23

DINOv3

DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 15 items • Updated Mar 10 • 707

upvoted an article 9 months ago

Article

🪆 Introduction to Matryoshka Embedding Models

tomaarsen, Xenova, osanseviero • Feb 23, 2024 • 212