simonycl/gsm8k_training_negative_combined_1k_gemini-2.5-flash_transformed • Updated Sep 11, 2025 • 1.76k • 1
simonycl/gsm8k_training_negative_vs_standard_1k_gemini-2.5-flash_transformed • Updated Sep 11, 2025 • 1.7k • 2
simonycl/gsm8k_training_negative_sequence_1k_gemini-2.5-flash_transformed • Updated Sep 11, 2025 • 1.78k
simonycl/gsm8k_training_negative_direct_1k_gemini-2.5-flash_transformed • Updated Sep 11, 2025 • 1.49k • 1
simonycl/gsm8k_training_negative_combined_1k_gpt-4.1_transformed • Updated Sep 11, 2025 • 1.92k • 1
simonycl/gsm8k_training_negative_vs_standard_1k_gpt-4.1_transformed • Updated Sep 11, 2025 • 1.93k • 2
simonycl/gsm8k_training_negative_sequence_1k_gpt-4.1_transformed • Updated Sep 11, 2025 • 1.88k • 1
simonycl/gsm8k_training_negative_direct_1k_gpt-4.1_transformed • Updated Sep 11, 2025 • 1.65k • 1
simonycl/game-eval-Qwen-Qwen3-32B-vs-Qwen-Qwen3-32B-20250908-101728 • Updated Sep 8, 2025 • 11.5k • 1
simonycl/game-eval-Qwen-Qwen3-32B-vs-Qwen-Qwen3-32B-20250908-101654 • Updated Sep 8, 2025 • 5.72k • 1
simonycl/game-eval-Qwen-Qwen3-32B-vs-Qwen-Qwen3-32B-20250908-101501 • Updated Sep 8, 2025 • 2.3k • 1
simonycl/game-eval-Qwen-Qwen3-32B-vs-Qwen-Qwen3-32B-20250907-135811 • Updated Sep 7, 2025 • 46.1k • 1
simonycl/game-eval-Qwen-QwQ-32B-vs-Qwen-QwQ-32B-20250829-232628 • Updated Aug 30, 2025 • 46.7k • 1
simonycl/game-eval-Qwen-QwQ-32B-vs-Qwen-QwQ-32B-20250729-131038 • Updated Jul 29, 2025 • 59k • 1
simonycl/game-eval-Qwen-QwQ-32B-vs-Qwen-QwQ-32B-20250716-180926 • Updated Jul 17, 2025 • 60k • 1
simonycl/game-eval-qwen-Qwen3-32B-vs-qwen-Qwen3-32B-20250716-040414 • Updated Jul 16, 2025 • 54.3k • 1
simonycl/game-eval-Qwen-QwQ-32B-vs-openai-gpt-4.1-mini-20250715-100319 • Updated Jul 15, 2025 • 38.7k • 1
simonycl/game-eval-qwen-Qwen3-32B-vs-openai-gpt-4.1-mini-20250715-100300 • Updated Jul 15, 2025 • 34.7k
simonycl/game-eval-qwen-Qwen3-14B-vs-openai-gpt-4.1-mini-20250715-095932 • Updated Jul 15, 2025 • 35k • 1
simonycl/game-eval-qwen-Qwen3-8B-vs-openai-gpt-4.1-mini-20250715-095908 • Updated Jul 15, 2025 • 33.1k • 3
simonycl/game-eval-qwen-Qwen3-8B-vs-openai-gpt-4.1-mini-20250715-095731 • Updated Jul 15, 2025 • 33.1k • 1
simonycl/game-eval-qwen-Qwen3-4B-vs-openai-gpt-4.1-mini-20250715-095656 • Updated Jul 15, 2025 • 34k • 1
simonycl/game-eval-qwen-Qwen3-1.7B-vs-openai-gpt-4.1-mini-20250715-095147 • Updated Jul 15, 2025 • 30k • 1