LLM Hallucination Detection Papers
Collection of LLM hallucination and evaluation papers that I've been exploring and implementing. Some of them have my comments and annotated doodles.
12 items • Updated Feb 20, 2024
Preference Leakage: A Contamination Problem in LLM-as-a-judge • arXiv:2502.01534 • Published Feb 3, 2025
The Differences Between Direct Alignment Algorithms are a Blur • arXiv:2502.01237 • Published Feb 3, 2025
Can Language Models Replace Programmers? REPOCOD Says 'Not Yet' • arXiv:2410.21647 • Published Oct 29, 2024
AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions • arXiv:2410.20424 • Published Oct 27, 2024
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free • arXiv:2410.10814 • Published Oct 14, 2024
What Matters in Transformers? Not All Attention is Needed • arXiv:2406.15786 • Published Jun 22, 2024
Law of the Weakest Link: Cross Capabilities of Large Language Models • arXiv:2409.19951 • Published Sep 30, 2024
Prithvi WxC: Foundation Model for Weather and Climate • arXiv:2409.13598 • Published Sep 20, 2024
ClimDetect: A Benchmark Dataset for Climate Change Detection and Attribution • arXiv:2408.15993 • Published Aug 28, 2024
TableGPT: Towards Unifying Tables, Nature Language and Commands into One GPT • arXiv:2307.08674 • Published Jul 17, 2023