EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning Paper • 2509.22576 • Published 7 days ago • 123 • 2
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers Paper • 2508.20453 • Published Aug 28 • 62 • 5
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers Paper • 2508.20453 • Published Aug 28 • 62 • 5
DUMP: Automated Distribution-Level Curriculum Learning for RL-based LLM Post-training Paper • 2504.09710 • Published Apr 13 • 19 • 2
The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models via Visual Information Steering Paper • 2502.03628 • Published Feb 5 • 12 • 3
The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models via Visual Information Steering Paper • 2502.03628 • Published Feb 5 • 12 • 3
MLLM-as-a-Judge for Image Safety without Human Labeling Paper • 2501.00192 • Published Dec 31, 2024 • 31 • 2