AIR-Bench Leaderboard 🥇 — explore benchmark results for QA and long-doc models
Prompt Cache: Modular Attention Reuse for Low-Latency Inference — Paper 2311.04934, published Nov 7, 2023
Open LLM Leaderboard best models ❤️🔥 — Collection: a daily updated list of the best-evaluated models on the LLM leaderboard (65 items, updated Mar 20)