kmorrow1's Collections
My Collection
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep
Thinking
Paper
•
2501.04519
•
Published
•
230
Learning an evolved mixture model for task-free continual learning
Paper
•
2207.05080
•
Published
•
1
EVOLvE: Evaluating and Optimizing LLMs For Exploration
Paper
•
2410.06238
•
Published
•
1
Smaller Language Models Are Better Instruction Evolvers
Paper
•
2412.11231
•
Published
•
27
VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page
Understanding and Grounding?
Paper
•
2404.05955
•
Published
An Evolved Universal Transformer Memory
Paper
•
2410.13166
•
Published
•
3
AgentGym: Evolving Large Language Model-based Agents across Diverse
Environments
Paper
•
2406.04151
•
Published
•
19
Chain-of-Table: Evolving Tables in the Reasoning Chain for Table
Understanding
Paper
•
2401.04398
•
Published
•
22
EvoCodeBench: An Evolving Code Generation Benchmark with Domain-Specific
Evaluations
Paper
•
2410.22821
•
Published
•
1
Learning Evolving Tools for Large Language Models
Paper
•
2410.06617
•
Published
•
2
PortLLM: Personalizing Evolving Large Language Models with Training-Free
and Portable Model Patches
Paper
•
2410.10870
•
Published
•
1
Generating and Evolving Reward Functions for Highway Driving with Large
Language Models
Paper
•
2406.10540
•
Published
•
1
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and
Evolution
Paper
•
2410.16256
•
Published
•
60
MUSCLE: A Model Update Strategy for Compatible LLM Evolution
Paper
•
2407.09435
•
Published
•
22
GAVEL: Generating Games Via Evolution and Language Models
Paper
•
2407.09388
•
Published
•
16
Reward Steering with Evolutionary Heuristics for Decoding-time Alignment
Paper
•
2406.15193
•
Published
•
14
Evolutionary Optimization of Model Merging Recipes
Paper
•
2403.13187
•
Published
•
51
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta
Chain-of-Thought
Paper
•
2501.04682
•
Published
•
83
BoostStep: Boosting mathematical capability of Large Language Models via
improved single-step reasoning
Paper
•
2501.03226
•
Published
•
35
MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation
Models
Paper
•
2501.00316
•
Published
•
22
Search-o1: Agentic Search-Enhanced Large Reasoning Models
Paper
•
2501.05366
•
Published
•
75
URSA: Understanding and Verifying Chain-of-thought Reasoning in
Multimodal Mathematics
Paper
•
2501.04686
•
Published
•
48
InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning
and Reflection
Paper
•
2501.04575
•
Published
•
22
Efficiently Serving LLM Reasoning Programs with Certaindex
Paper
•
2412.20993
•
Published
•
35
Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via
Collective Monte Carlo Tree Search
Paper
•
2412.18319
•
Published
•
37
Token-Budget-Aware LLM Reasoning
Paper
•
2412.18547
•
Published
•
45
B-STaR: Monitoring and Balancing Exploration and Exploitation in
Self-Taught Reasoners
Paper
•
2412.17256
•
Published
•
45
ShowUI: One Vision-Language-Action Model for GUI Visual Agent
Paper
•
2411.17465
•
Published
•
78
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
Paper
•
2412.04454
•
Published
•
59
AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web
Tutorials
Paper
•
2412.09605
•
Published
•
28
OmniManip: Towards General Robotic Manipulation via Object-Centric
Interaction Primitives as Spatial Constraints
Paper
•
2501.03841
•
Published
•
49
Agents for self-driving laboratories applied to quantum computing
Paper
•
2412.07978
•
Published
•
1
Towards Scientific Discovery with Generative AI: Progress,
Opportunities, and Challenges
Paper
•
2412.11427
•
Published
•
1
AEGIS: An Agent-based Framework for General Bug Reproduction from Issue
Descriptions
Paper
•
2411.18015
•
Published
•
1
LLM4SR: A Survey on Large Language Models for Scientific Research
Paper
•
2501.04306
•
Published
•
33
Using Generative AI and Multi-Agents to Provide Automatic Feedback
Paper
•
2411.07407
•
Published
•
1
Designing Reliable Experiments with Generative Agent-Based Modeling: A
Comprehensive Guide Using Concordia by Google DeepMind
Paper
•
2411.07038
•
Published
•
1
Agent Laboratory: Using LLM Agents as Research Assistants
Paper
•
2501.04227
•
Published
•
77
A Multi-AI Agent System for Autonomous Optimization of Agentic AI
Solutions via Iterative Refinement and LLM-Driven Feedback Loops
Paper
•
2412.17149
•
Published
•
1
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains
Paper
•
2501.05707
•
Published
•
18
Enabling Scalable Oversight via Self-Evolving Critic
Paper
•
2501.05727
•
Published
•
64
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation
Paper
•
2501.01895
•
Published
•
48
Understanding Self-Predictive Learning for Reinforcement Learning
Paper
•
2212.03319
•
Published
Grokfast: Accelerated Grokking by Amplifying Slow Gradients
Paper
•
2405.20233
•
Published
•
6
Paper
•
2402.09470
•
Published
•
11
Vid2Robot: End-to-end Video-conditioned Policy Learning with
Cross-Attention Transformers
Paper
•
2403.12943
•
Published
•
15
TinyFusion: Diffusion Transformers Learned Shallow
Paper
•
2412.01199
•
Published
•
14
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs
Paper
•
2501.06186
•
Published
•
55
Test of Time: A Benchmark for Evaluating LLMs on Temporal Reasoning
Paper
•
2406.09170
•
Published
•
26
Demystifying Domain-adaptive Post-training for Financial LLMs
Paper
•
2501.04961
•
Published
•
10
Enhancing Human-Like Responses in Large Language Models
Paper
•
2501.05032
•
Published
•
46
The Lessons of Developing Process Reward Models in Mathematical
Reasoning
Paper
•
2501.07301
•
Published
•
72
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper
•
2501.08313
•
Published
•
258
HALoGEN: Fantastic LLM Hallucinations and Where to Find Them
Paper
•
2501.08292
•
Published
•
16
PokerBench: Training Large Language Models to become Professional Poker
Players
Paper
•
2501.08328
•
Published
•
13
Tarsier2: Advancing Large Vision-Language Models from Detailed Video
Description to Comprehensive Video Understanding
Paper
•
2501.07888
•
Published
•
12
Potential and Perils of Large Language Models as Judges of Unstructured
Textual Data
Paper
•
2501.08167
•
Published
•
6
Tensor Product Attention Is All You Need
Paper
•
2501.06425
•
Published
•
66
Transformer^2: Self-adaptive LLMs
Paper
•
2501.06252
•
Published
•
46
SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
Paper
•
2501.06842
•
Published
•
14
WebWalker: Benchmarking LLMs in Web Traversal
Paper
•
2501.07572
•
Published
•
18
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical
Reasoning
Paper
•
2501.06458
•
Published
•
29
Evaluating Sample Utility for Data Selection by Mimicking Model Weights
Paper
•
2501.06708
•
Published
•
5
ChemAgent: Self-updating Library in Large Language Models Improves
Chemical Reasoning
Paper
•
2501.06590
•
Published
•
7
OmniThink: Expanding Knowledge Boundaries in Machine Writing through
Thinking
Paper
•
2501.09751
•
Published
•
29
RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation
Paper
•
2501.08617
•
Published
•
7
Learnings from Scaling Visual Tokenizers for Reconstruction and
Generation
Paper
•
2501.09755
•
Published
•
21
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with
Large Language Models
Paper
•
2501.09686
•
Published
•
16
FAST: Efficient Action Tokenization for Vision-Language-Action Models
Paper
•
2501.09747
•
Published
•
12