V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning Paper • 2506.09985 • Published 15 days ago • 26
V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of V-JEPA (https://ai.meta.com/blog/v-jepa-yann) • 8 items • Updated 13 days ago • 128
Changelog Xet is now the default storage option for new users and organizations May 23 • 66
Article The Transformers Library: standardizing model definitions By lysandre and 3 others • May 15 • 114
Article LeRobot Community Datasets: The “ImageNet” of Robotics — When and How? By danaaubakirova and 6 others • May 11 • 64
D-FINE Collection State-of-the-art real-time object detection model with Apache 2.0 license • 15 items • Updated May 5 • 55
Article Welcome to Inference Providers on the Hub 🔥 By julien-c and 6 others • Jan 28 • 484
Describe Anything Collection Multimodal Large Language Models for Detailed Localized Image and Video Captioning • 7 items • Updated about 2 hours ago • 52
Article Gotchas in Tokenizer Behavior Every Developer Should Know By qgallouedec • Apr 18 • 37
Article Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques By jmamou and 8 others • Mar 24 • 18
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory. • 15 items • Updated 27 days ago • 201
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The models preserve quality similar to half precision while using 3x less memory. • 19 items • Updated Apr 18 • 27
Sa2VA Model Zoo Collection Hugging Face Model Zoo for Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos By Bytedance Seed CV Research • 4 items • Updated Feb 9 • 37
Pixel-SAIL: Single Transformer For Pixel-Grounded Understanding Paper • 2504.10465 • Published Apr 14 • 28