EQ-Bench: An Emotional Intelligence Benchmark for Large Language Models • Paper • 2312.06281 • Published Dec 11, 2023
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation • Paper • 2409.06820 • Published Sep 10, 2024