muses-llm/humaneval_qwen7b_gpt-4o-mini_att_iter0_att100_sol10_snap Viewer • Updated Apr 19 • 6.89k • 9
muses-llm/humaneval_qwen7b_gpt-4o-mini_att_iter0_att20_sol10_snap Viewer • Updated Apr 18 • 1.37k • 10
muses-llm/bigcodebench_qwen7b_att_iter0_ppo_att20_sol10_rerun_worker4_relabeled_dpo_6000 Viewer • Updated Apr 16 • 7.5k • 10
muses-llm/bigcodebench_qwen7b_att_iter0_ppo_att20_sol10_rerun_worker4 Viewer • Updated Mar 12 • 15.9k • 10
muses-llm/bigcodebench_qwen7b_att_iter0_ppo_att20_sol50_rerun_worker4 Viewer • Updated Mar 11 • 926 • 10
Discover and Cure: Concept-aware Mitigation of Spurious Correlation Paper • 2305.00650 • Published May 1, 2023
STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases Paper • 2404.13207 • Published Apr 19, 2024
AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning Paper • 2406.11200 • Published Jun 17, 2024