OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use Paper • 2508.04482 • Published 9 days ago • 9
Hidden Dynamics of Massive Activations in Transformer Training Paper • 2508.03616 • Published 10 days ago • 17
InfiGUI-G1: Advancing GUI Grounding with Adaptive Exploration Policy Optimization Paper • 2508.05731 • Published 8 days ago • 24
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published 7 days ago • 139
Article Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval By aamirshakir and 2 others • Mar 22, 2024 • 100
Article You could have designed state of the art positional encoding By FL33TW00D-HF • Nov 25, 2024 • 340
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 149
Article Finally, a Replacement for BERT: Introducing ModernBERT By bclavie and 14 others • Dec 19, 2024 • 679
ReasonMed: A 370K Multi-Agent Generated Dataset for Advancing Medical Reasoning Paper • 2506.09513 • Published Jun 11 • 98
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models Paper • 2506.06395 • Published Jun 5 • 129
RoFormer: Enhanced Transformer with Rotary Position Embedding Paper • 2104.09864 • Published Apr 20, 2021 • 14
Fine-Tuning Small Language Models for Domain-Specific AI: An Edge AI Perspective Paper • 2503.01933 • Published Mar 3 • 12
Phantom: Subject-consistent video generation via cross-modal alignment Paper • 2502.11079 • Published Feb 16 • 60
Have We Designed Generalizable Structural Knowledge Promptings? Systematic Evaluation and Rethinking Paper • 2501.00244 • Published Dec 31, 2024 • 1
Training language models to follow instructions with human feedback Paper • 2203.02155 • Published Mar 4, 2022 • 21
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 241