Edinburgh Dataset Analytics Working Group

university

AI & ML interests

None defined yet.

edinburgh-dawg's activity

aryopg

authored a paper 13 days ago

An Analysis of Decoding Methods for LLM-based Agents for Faithful Multi-Hop Question Answering

Paper • 2503.23415 • Published 16 days ago • 1

yuzhaouoe

authored a paper about 1 month ago

Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression

Paper • 2503.02812 • Published Mar 4 • 9

rohitsaxena

authored 2 papers about 2 months ago

Lost in Time: Clock and Calendar Understanding Challenges in Multimodal LLMs

Paper • 2502.05092 • Published Feb 7 • 8

PosterSum: A Multimodal Benchmark for Scientific Poster Summarization

Paper • 2502.17540 • Published Feb 24 • 3

aryopg

in edinburgh-dawg/mmlu-redux-2.0 about 2 months ago

Report: no correct answer

#2 opened 2 months ago by ZhangRC

aryopg

updated a dataset about 2 months ago

edinburgh-dawg/mmlu-redux-2.0

Viewer • Updated Feb 25 • 5.7k rows • 1.61k downloads • 18 likes

HEmile

authored a paper about 2 months ago

Mixtures of In-Context Learners

Paper • 2411.02830 • Published Nov 5, 2024 • 2

aryopg

authored a paper 2 months ago

Lost in Time: Clock and Calendar Understanding Challenges in Multimodal LLMs

Paper • 2502.05092 • Published Feb 7 • 8

pminervini

authored 6 papers 2 months ago

FLARE: Faithful Logic-Aided Reasoning and Exploration

Paper • 2410.11900 • Published Oct 14, 2024 • 4

SynDARin: Synthesising Datasets for Automated Reasoning in Low-Resource Languages

Paper • 2406.14425 • Published Jun 20, 2024 • 2

aryopg

updated a dataset 2 months ago

edinburgh-dawg/mmlu-redux

Viewer • Updated Feb 9 • 3k rows • 3.2k downloads • 32 likes

acDante

authored 2 papers 5 months ago

The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models

Paper • 2404.05904 • Published Apr 8, 2024 • 9

Are We Done with MMLU?

Paper • 2406.04127 • Published Jun 6, 2024 • 40

Jforeverss

authored a paper 5 months ago

CoMAT: Chain of Mathematically Annotated Thought Improves Mathematical Reasoning

Paper • 2410.10336 • Published Oct 14, 2024 • 2

acDante

authored 2 papers 5 months ago

Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering

Paper • 2410.15999 • Published Oct 21, 2024 • 20

Analysing the Residual Stream of Language Models Under Knowledge Conflicts

Paper • 2410.16090 • Published Oct 21, 2024 • 7

Team members 14
