r PRO

oceansweep

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

OpenRT: An Open-Source Red Teaming Framework for Multimodal LLMs

upvoted a paper 1 day ago

RedBench: A Universal Dataset for Comprehensive Red Teaming of Large Language Models

upvoted a paper 8 days ago

End-to-End Test-Time Training for Long Context

Organizations

None yet

upvoted 2 papers 1 day ago

OpenRT: An Open-Source Red Teaming Framework for Multimodal LLMs

Paper • 2601.01592 • Published 5 days ago • 11

RedBench: A Universal Dataset for Comprehensive Red Teaming of Large Language Models

Paper • 2601.03699 • Published 2 days ago • 5

upvoted 3 papers 8 days ago

End-to-End Test-Time Training for Long Context

Paper • 2512.23675 • Published 11 days ago • 16

UltraShape 1.0: High-Fidelity 3D Shape Generation via Scalable Geometric Refinement

Paper • 2512.21185 • Published 16 days ago • 28

mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published 9 days ago • 230

liked a model 11 days ago

inclusionAI/TwinFlow-Z-Image-Turbo

Text-to-Image • Updated 11 days ago • 473 • 200

upvoted a paper 17 days ago

Are We on the Right Way to Assessing LLM-as-a-Judge?

Paper • 2512.16041 • Published 22 days ago • 32

liked a model 20 days ago

XiaomiMiMo/MiMo-V2-Flash

Text Generation • 310B • Updated 22 days ago • 31.4k • 557

liked a model 22 days ago

YatharthS/MiraTTS

Text-to-Speech • 0.5B • Updated 16 days ago • 6.72k • 176

upvoted 2 papers 23 days ago

QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management

Paper • 2512.12967 • Published 25 days ago • 103

Memory in the Age of AI Agents

Paper • 2512.13564 • Published 25 days ago • 133

liked a model 25 days ago

ResembleAI/chatterbox-turbo

Text-to-Speech • Updated 25 days ago • 537

reacted to Molbap's post with 🔥 29 days ago

🚀 New blog: Maintain the unmaintainable – 1M+ Python LOC, 400+ models

How do you stop a million-line library built by thousands of contributors from collapsing under its own weight?
At 🤗 Transformers, we do it with explicit software-engineering tenets, principles that make the codebase hackable at scale.

🔍 Inside the post:
– One Model, One File: readability first, so you can still open a modeling file and see the full logic, top to bottom.
– Modular Transformers: visible inheritance that cuts maintenance cost by ~15× while keeping models readable.
– Config-Driven Performance: FlashAttention, tensor parallelism, and attention scheduling are config-level features, not rewrites.

Written with @lysandre, @pcuenq, and @yonigozlan, this is a deep dive into how Transformers stays fast, open, and maintainable.

Read it here → transformers-community/Transformers-tenets

liked 4 models about 1 month ago

upvoted 2 papers about 1 month ago

In-Context Representation Hijacking

Paper • 2512.03771 • Published Dec 3, 2025 • 3

Qwen3-VL Technical Report

Paper • 2511.21631 • Published Nov 26, 2025 • 149

liked a model about 1 month ago

microsoft/VibeVoice-Realtime-0.5B

Text-to-Speech • 1B • Updated 28 days ago • 305k • 1.05k
