rl-rag/qwen3-8b-base-combined-sft-training-data-v20250824_MiroSystemPrompt Text Generation • 8B • Updated 16 days ago • 15
rl-rag/qwen3-8b-combined-sft-training-data-v20250824_MiroSystemPrompt Text Generation • 8B • Updated 16 days ago • 14
rl-rag/qwen3-4b-it-combined-sft-training-data-v20250824_MiroSystemPrompt Text Generation • 4B • Updated 16 days ago • 12
rl-rag/qwen2.5-7b-combined-sft-training-data-v20250824_MiroSystemPrompt Text Generation • 8B • Updated 16 days ago • 13
rl-rag/rl_rag_sqa_searcharena_rubrics_web_augmented_rubrics_only_call_tool Updated 13 days ago • 2.94k • 145
rl-rag/rl_rag_sqa_searcharena_rubrics_web_augmented_rubrics_only_with_new_mcp_system_prompt Updated 16 days ago • 2.94k • 149
rl-rag/rl_rag_sqa_searcharena_rubrics_web_augmented_longform_averaged_outcome_with_system_prompt Updated 16 days ago • 2.94k • 120
rl-rag/rl_rag_sqa_searcharena_rubrics_web_augmented_outcome_with_new_mcp_system_prompt Updated 16 days ago • 2.94k • 76