Building on HF

Prithiv Sakthi PRO

prithivMLmods

hugging-science

https://linktr.ee/prithivsakthi

AI & ML interests

computer vision, nlp, multimodality - HuggingFace Fellow🤗

Recent Activity

updated a model about 2 hours ago

strangeropshf/test-vl-mini

published a model about 2 hours ago

strangeropshf/test-vl-mini

updated a Space about 4 hours ago

prithivMLmods/VLM-Parsing

upvoted a collection 1 day ago

Sapiens2

26 items • Updated 3 days ago • 15

upvoted a paper 3 days ago

WorldMark: A Unified Benchmark Suite for Interactive Video World Models

Paper • 2604.21686 • Published 5 days ago • 36

upvoted a collection 3 days ago

Garnet OCR

Collection of Garnet OCR Models • 4 items • Updated about 9 hours ago • 1

upvoted a collection 4 days ago

DeepSeek-V4

4 items • Updated 4 days ago • 547

upvoted a collection 5 days ago

Qwen3.6-27B Compressions

Collection of various compressed versions of the Qwen3.6-27B native multimodal model. • 7 items • Updated about 9 hours ago • 2

upvoted a collection 6 days ago

WildDet3D

This is the collection of WildDet3D artifacts, including demos, model checkpoints and data. https://github.com/allenai/WildDet3D • 8 items • Updated 14 days ago • 17

upvoted an article 6 days ago

Article

Building a Fast Multilingual OCR Model with Synthetic Data

10 days ago • 30

upvoted 2 papers 8 days ago

KV Packet: Recomputation-Free Context-Independent KV Caching for LLMs

Paper • 2604.13226 • Published 14 days ago • 10

HiVLA: A Visual-Grounded-Centric Hierarchical Embodied Manipulation System

Paper • 2604.14125 • Published 13 days ago • 20

upvoted a changelog 8 days ago

Hugging Face Changelog

Spaces agents.md for your coding agents

10 days ago • 158

upvoted a collection 9 days ago

Qwen3.6-35B-A3B Compressions

Collection of various compressed versions of the Qwen3.6-35B-A3B native multimodal model. • 4 items • Updated about 9 hours ago • 2

upvoted 3 papers 9 days ago

GlobalSplat: Efficient Feed-Forward 3D Gaussian Splatting via Global Scene Tokens

Paper • 2604.15284 • Published 12 days ago • 24

HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds

Paper • 2604.14268 • Published 13 days ago • 115

RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework

Paper • 2604.15308 • Published 12 days ago • 29

upvoted a changelog 11 days ago

Hugging Face Changelog

Introducing Kernels

12 days ago • 152

upvoted an article 13 days ago

Article

Stop benchmarking inference providers

13 days ago • 8

upvoted 2 papers 15 days ago

WildDet3D: Scaling Promptable 3D Detection in the Wild

Paper • 2604.08626 • Published 19 days ago • 241

EXAONE 4.5 Technical Report

Paper • 2604.08644 • Published 19 days ago • 66

upvoted 2 papers 16 days ago

MolmoWeb: Open Visual Web Agent and Open Data for the Open Web

Paper • 2604.08516 • Published 19 days ago • 42

HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents

Paper • 2604.07430 • Published 20 days ago • 187