Collections

Collections including paper arxiv:2504.17192

MLLM-as-a-Judge for Image Safety without Human Labeling

Paper • 2501.00192 • Published Dec 31, 2024 • 31
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1 • 108
Xmodel-2 Technical Report

Paper • 2412.19638 • Published Dec 27, 2024 • 27
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published Dec 25, 2024 • 102

Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

Paper • 2504.17192 • Published 4 days ago • 81

about 1 hour ago

PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters

Paper • 2504.08791 • Published 21 days ago • 125
TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published 5 days ago • 90
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

Paper • 2504.17192 • Published 4 days ago • 81

To Read collection

interesting papers to read

about 10 hours ago

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published 27 days ago • 63
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

Paper • 2503.18878 • Published Mar 24 • 118
START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6 • 111
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 122

about 1 hour ago

BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing

Paper • 2503.13434 • Published Mar 17 • 26
Edit Transfer: Learning Image Editing via Vision In-Context Relations

Paper • 2503.13327 • Published Mar 17 • 29
WideRange4D: Enabling High-Quality 4D Reconstruction with Wide-Range Movements and Scenes

Paper • 2503.13435 • Published Mar 17 • 17
MediaTek-Research/Llama-Breeze2-8B-Instruct

Updated Mar 2 • 2.04k • 35

Methods of Tough

Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search

Paper • 2502.02508 • Published Feb 4 • 23
Chain of Draft: Thinking Faster by Writing Less

Paper • 2502.18600 • Published Feb 25 • 48
Chain of Agents: Large Language Models Collaborating on Long-Context Tasks

Paper • 2406.02818 • Published Jun 4, 2024
Chain-of-Retrieval Augmented Generation

Paper • 2501.14342 • Published Jan 24 • 56

RuCCoD: Towards Automated ICD Coding in Russian

Paper • 2502.21263 • Published Feb 28 • 133
Unified Reward Model for Multimodal Understanding and Generation

Paper • 2503.05236 • Published Mar 7 • 123
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching

Paper • 2503.05179 • Published Mar 7 • 46
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning

Paper • 2503.05592 • Published Mar 7 • 27

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6 • 111
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion

Paper • 2503.11576 • Published Mar 14 • 99
ToolRL: Reward is All Tool Learning Needs

Paper • 2504.13958 • Published 11 days ago • 40
OTC: Optimal Tool Calls via Reinforcement Learning

Paper • 2504.14870 • Published 7 days ago • 32

AI-Automated Scientific Research

SurveyX: Academic Survey Automation via Large Language Models

Paper • 2502.14776 • Published Feb 20 • 100
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery

Paper • 2408.06292 • Published Aug 12, 2024 • 125
Towards an AI co-scientist

Paper • 2502.18864 • Published Feb 26 • 49
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

Paper • 2504.17192 • Published 4 days ago • 81

PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC

Paper • 2502.14282 • Published Feb 20 • 20
PlanGEN: A Multi-Agent Framework for Generating Planning and Reasoning Trajectories for Complex Problem Solving

Paper • 2502.16111 • Published Feb 22 • 9
Agent models: Internalizing Chain-of-Action Generation into Reasoning models

Paper • 2503.06580 • Published Mar 9 • 17
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework

Paper • 2308.08155 • Published Aug 16, 2023 • 7
