Article FineWeb-C: A Community-Driven Dataset for Educational Quality Annotations in 122 Languages By davanstrien and 5 others • 2 days ago • 23
Kontext Dev LoRAs Collection Collection of Kontext Dev LoRAs by fal • 23 items • Updated about 18 hours ago • 9
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning Paper • 2505.17667 • Published May 23 • 88
Article TinyAgents: A Minimal Experiment with Code Agents and MCP Tools By albertvillanova • May 16 • 30
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback Paper • 2309.00267 • Published Sep 1, 2023 • 50
Enhancing Non-Core Language Instruction-Following in Speech LLMs via Semi-Implicit Cross-Lingual CoT Reasoning Paper • 2504.20835 • Published Apr 29 • 1
Phi-4 (All Versions) Collection Microsoft's Phi-4 models including Reasoning + Reasoning Plus & mini. Includes Dynamic 2.0 GGUF, 4-bit & 16-bit versions. Includes Unsloth's bug fixes • 20 items • Updated 8 days ago • 71
Article ChatGPT-4o's Image Generation Capabilities and Its Wild Examples By prithivMLmods • Apr 5 • 20
Article Preference Optimization for Vision Language Models By qgallouedec and 3 others • Jul 10, 2024 • 79