Ai2

non-profit

Verified

https://allenai.org/

allen_ai

allenai

AI & ML interests

Building breakthrough AI to solve the world's biggest problems.

Recent Activity

hqfang published a dataset 2 days ago

allenai/libero

hqfang published a model 2 days ago

allenai/MolmoAct-7B-D-Captioner-0812

lihaoxin2020 authored a paper 10 days ago

Demystifying Scientific Problem-Solving in LLMs by Probing Knowledge and Reasoning

Articles

Introducing the Open Chain of Thought Leaderboard

hqfang

published a dataset 2 days ago

allenai/libero

Viewer • Updated 12 days ago • 521k • 11

hqfang

published a model 2 days ago

allenai/MolmoAct-7B-D-Captioner-0812

Robotics • 8B • Updated 3 days ago • 2

MahtabBg

authored a paper about 1 month ago

MedBLINK: Probing Basic Perception in Multimodal Language Models for Medicine

Paper • 2508.02951 • Published Aug 4

yilunzhao

authored 2 papers about 2 months ago

Can LLMs Identify Critical Limitations within Scientific Research? A Systematic Evaluation on AI Research Papers

Paper • 2507.02694 • Published Jul 3 • 18

Can Multimodal Foundation Models Understand Schematic Diagrams? An Empirical Study on Information-Seeking QA over Scientific Papers

Paper • 2507.10787 • Published Jul 14 • 11

yuntian-deng

authored a paper about 2 months ago

NeuralOS: Towards Simulating Operating Systems via Neural Generative Models

Paper • 2507.08800 • Published Jul 11 • 79

yilunzhao

authored a paper 2 months ago

Efficiency-Effectiveness Reranking FLOPs for LLM-based Rerankers

Paper • 2507.06223 • Published Jul 8 • 13

sewon

authored 13 papers 2 months ago

BTR: Binary Token Representations for Efficient Retrieval Augmented Language Models

Paper • 2310.01329 • Published Oct 2, 2023

UnifiedQA: Crossing Format Boundaries With a Single QA System

Paper • 2005.00700 • Published May 2, 2020

Dense Passage Retrieval for Open-Domain Question Answering

Paper • 2004.04906 • Published Apr 10, 2020 • 2

AmbigQA: Answering Ambiguous Open-domain Questions

Paper • 2004.10645 • Published Apr 22, 2020

FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation

Paper • 2305.14251 • Published May 23, 2023 • 2

MetaICL: Learning to Learn In Context

Paper • 2110.15943 • Published Oct 29, 2021

Prompt Waywardness: The Curious Case of Discretized Interpretation of Continuous Prompts

Paper • 2112.08348 • Published Dec 15, 2021

Do Membership Inference Attacks Work on Large Language Models?

Paper • 2402.07841 • Published Feb 12, 2024

Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?

Paper • 2202.12837 • Published Feb 25, 2022 • 1

Measuring and Narrowing the Compositionality Gap in Language Models

Paper • 2210.03350 • Published Oct 7, 2022

Exploring The Landscape of Distributional Robustness for Question Answering Models

Paper • 2210.12517 • Published Oct 22, 2022

CREPE: Open-Domain Question Answering with False Presuppositions

Paper • 2211.17257 • Published Nov 30, 2022

Nonparametric Masked Language Modeling

Paper • 2212.01349 • Published Dec 2, 2022 • 1