v3rganz's Collections
queue

updated May 11

  • I-Con: A Unifying Framework for Representation Learning

    Paper • 2504.16929 • Published Apr 23 • 30

  • LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities

    Paper • 2504.16078 • Published Apr 22 • 20

  • WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents

    Paper • 2504.15785 • Published Apr 22 • 19

  • OTC: Optimal Tool Calls via Reinforcement Learning

    Paper • 2504.14870 • Published Apr 21 • 33

  • Reinforcement Learning for Reasoning in Large Language Models with One Training Example

    Paper • 2504.20571 • Published Apr 29 • 97

  • ReasonIR: Training Retrievers for Reasoning Tasks

    Paper • 2504.20595 • Published Apr 29 • 55

  • Taming the Titans: A Survey of Efficient LLM Inference Serving

    Paper • 2504.19720 • Published Apr 28 • 12

  • DoRA: Weight-Decomposed Low-Rank Adaptation

    Paper • 2402.09353 • Published Feb 14, 2024 • 27

  • SWE-smith: Scaling Data for Software Engineering Agents

    Paper • 2504.21798 • Published Apr 30 • 10

  • s1: Simple test-time scaling

    Paper • 2501.19393 • Published Jan 31 • 126