Paper: Pushing the Limits of Large Language Model Quantization via the Linearity Theorem (arXiv:2411.17525, published Nov 26, 2024)
Collection: HIGGS — models prequantized with [HIGGS](https://arxiv.org/abs/2411.17525) zero-shot quantization; requires the latest `transformers` to run (18 items, updated Feb 18)
Paper: TRACER: Trace-Based Adaptive Cost-Efficient Routing for LLM Classification (arXiv:2604.14531, published 7 days ago)
Paper: BERT-as-a-Judge: A Robust Alternative to Lexical Methods for Efficient Reference-Based LLM Evaluation (arXiv:2604.09497, published 13 days ago)
Collection: HLWQ Unified (Weights Q5 + KV Cache Q3) — full-stack HLWQ: Q5 weights + torchao INT4 + Q3 KV cache; formerly PolarQuant Unified (16 items, updated 5 days ago)
Collection: Rethink_SFT_generalization — repo for the paper "Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability" (40 items, updated 12 days ago)
Article: How we OCR'ed 30,000 papers using Codex, open OCR models and Jobs (published 16 days ago)
Collection: Nemotron-Post-Training-v3 — datasets used in the post-training phase of Nemotron Nano and Super v3 (28 items, updated 3 days ago)
Collection: Open Pangram — open models and datasets based on Pangram's ICLR 2026 EditLens paper, licensed for noncommercial use only under CC BY-NC-SA 4.0 (4 items, updated 30 days ago)
Collection: CodeScout — RL-trained code search agents (1.7B, 4B, 14B) that outperform 2–18× larger models using only a Unix terminal; arxiv.org/abs/2603.17829 (12 items, updated Mar 19)
Article: Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries (published Mar 10)
Collection: Distil Efficiency Benchmarks — models used in the blog post www.distillabs.ai/blog/the-10x-inference-tax-you-dont-have-to-pay (9 items, updated Mar 2)