ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning Paper • 2502.01100 • Published Feb 3, 2025
SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories Paper • 2409.07440 • Published Sep 11, 2024
The Code2Text Challenge: Text Generation in Source Code Libraries Paper • 1708.00098 • Published Jul 31, 2017
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research Paper • 2402.00159 • Published Jan 31, 2024
Learning Semantic Correspondences in Technical Documentation Paper • 1705.04815 • Published May 13, 2017
Text Modular Networks: Learning to Decompose Tasks in the Language of Existing Models Paper • 2009.00751 • Published Sep 1, 2020
Temporal Reasoning on Implicit Events from Distant Supervision Paper • 2010.12753 • Published Oct 24, 2020
Hey AI, Can You Solve Complex Tasks by Talking to Agents? Paper • 2110.08542 • Published Oct 16, 2021
What Does My QA Model Know? Devising Controlled Probes using Expert Knowledge Paper • 1912.13337 • Published Dec 31, 2019
A Dataset for Tracking Entities in Open Domain Procedural Text Paper • 2011.08092 • Published Oct 31, 2020
Neural Natural Language Inference Models Partially Embed Theories of Lexical Entailment and Negation Paper • 2004.14623 • Published Apr 30, 2020
MonaLog: a Lightweight System for Natural Language Inference Based on Monotonicity Paper • 1910.08772 • Published Oct 19, 2019
Investigating Transfer Learning in Multilingual Pre-trained Language Models through Chinese Natural Language Inference Paper • 2106.03983 • Published Jun 7, 2021
Learning to Decompose: Hypothetical Question Decomposition Based on Comparable Texts Paper • 2210.16865 • Published Oct 30, 2022
Breakpoint Transformers for Modeling and Tracking Intermediate Beliefs Paper • 2211.07950 • Published Nov 15, 2022