SPaRK-RL combines reinforcement learning (RL) and large language models (LLMs) to improve exploration using diverse tool generation during inference.

gabrielbo/explore-rl-hotpota-trajectories • Updated May 9 • 4
gabrielbo/swirl-trajectories-mmlu-pro • Viewer • Updated May 20 • 24.8k • 21 • 2
gabrielbo/spark-model-QLoRA • Text Generation • Updated May 24 • 1