The collection for the Paper "WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents"
HKUST NLP Group
university
The collection for the Paper "Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping"
The collection for the Project "Simple Reinforcement Learning for Reasoning"
Resources of M-STAR (Multimodal Self-Evolving Training for Reasoning) https://mstar-lmm.github.io/
The collection for the Paper "Pitfalls of Rule- and Model-based Verifiers: A Case Study on Mathematical Reasoning."
- hkust-nlp/Qwen-2.5-7B-Verifier-R1-Verifier-1.5B
  Reinforcement Learning • 8B • Updated • 11 • 1
- hkust-nlp/R1-Distill-Verifier-1.5B
  2B • Updated • 10 • 1
- hkust-nlp/Qwen-2.5-7B-Verifier-HF
  Reinforcement Learning • 8B • Updated • 7
- hkust-nlp/Qwen-2.5-7B-Verifier-R1-Qwen-1.5B
  Reinforcement Learning • 8B • Updated • 6
The collection for the Paper "SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild"
Collection for CodeI/O @ https://codei-o.github.io/
Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving [NeurIPS 2024] @ https://github.com/hkust-nlp/dart-math
- DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
  Paper • 2407.13690 • Published • 2
- hkust-nlp/dart-math-hard
  Viewer • Updated • 585k • 132 • 14
- hkust-nlp/dart-math-dsmath-7b-prop2diff
  Text Generation • 7B • Updated • 11 • 3
- hkust-nlp/dart-math-llama3-8b-prop2diff
  Text Generation • 8B • Updated • 1.13k • 1