IRCAN: Mitigating Knowledge Conflicts in LLM Generation via Identifying and Reweighting Context-Aware Neurons Paper • 2406.18406 • Published Jun 26, 2024
Sibyl: Simple yet Effective Agent Framework for Complex Real-world Reasoning Paper • 2407.10718 • Published Jul 15, 2024 • 19
Benchmarks Underestimate the Readiness of Multi-lingual Dialogue Agents Paper • 2405.17840 • Published May 28, 2024
ChatMusician: Understanding and Generating Music Intrinsically with LLM Paper • 2402.16153 • Published Feb 25, 2024 • 61
A Hybrid Task-Oriented Dialog System with Domain and Task Adaptive Pretraining Paper • 2102.04506 • Published Feb 8, 2021
X-RiSAWOZ: High-Quality End-to-End Multilingual Dialogue Datasets and Few-shot Agents Paper • 2306.17674 • Published Jun 30, 2023
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code Paper • 2206.11249 • Published Jun 22, 2022
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement Paper • 2402.14658 • Published Feb 22, 2024 • 84
RoleEval: A Bilingual Role Evaluation Benchmark for Large Language Models Paper • 2312.16132 • Published Dec 26, 2023 • 2