MiniMax-M1 Collection MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. • 4 items • Updated 7 days ago • 99
General-Reasoner Collection Advancing LLMs' general reasoning capabilities • 9 items • Updated 1 day ago • 4
view article Article Selective fine-tuning of Language Models with Spectrum By anakin87 • Sep 3, 2024 • 36
view article Article The N Implementation Details of RLHF with PPO By vwxyzjn and 2 others • Oct 24, 2023 • 59
view article Article BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks By terryyz and 8 others • Jun 18, 2024 • 49
[lecture artifacts] aligning open language models Collection artifacts referenced in the talk timeline! Slides: https://docs.google.com/presentation/d/1quMyI4BAx4rvcDfk8jjv063bmHg4RxZd9mhQloXpMn0/edit?usp=sharin • 63 items • Updated Apr 17, 2024 • 56
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences Paper • 2404.03715 • Published Apr 4, 2024 • 62
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models Paper • 2310.08491 • Published Oct 12, 2023 • 55