Dcas89 PRO

AI & ML interests

None yet

Recent Activity

reacted to sergiopaniego's post with 🤗 about 1 month ago

ICYMI, transformers v5 is out! Grab a coffee ☕ and go read the announcement blog https://huggingface.co/blog/transformers-v5

reacted to Jofthomas's post with 🔥 about 1 month ago

The new Mistral 3 models are here! Today, we announce Mistral 3, the next generation of Mistral models. Mistral 3 includes three state-of-the-art small, dense models (14B, 8B, and 3B) and Mistral Large 3, our most capable model to date, a sparse mixture-of-experts trained with 41B active and 675B total parameters. All models are released under the Apache 2.0 license. Ministrals: https://huggingface.co/collections/mistralai/ministral-3 Mistral Large 3: https://huggingface.co/collections/mistralai/mistral-large-3

liked a dataset about 2 months ago

minwoosun/CholecSeg8k

Organizations

None yet

upvoted a paper 6 months ago

KV Cache Steering for Inducing Reasoning in Small Language Models

Paper • 2507.08799 • Published Jul 11, 2025 • 40

upvoted an article 8 months ago

I trained a Language Model to schedule events with GRPO!

Article • Apr 29, 2025 • 91

upvoted a paper 9 months ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31, 2025 • 124

upvoted an article over 1 year ago

A failed experiment: Infini-Attention, and why we should keep trying?

Article • Aug 14, 2024 • 74