ryokamoi's Collections
FoVer
VisOnlyQA
ReaLMistake
ReaLMistake
Updated 3 days ago
Benchmark for error detection in LLM responses
Evaluating LLMs at Detecting Errors in LLM Responses
Paper • 2404.03602 • Published Apr 4, 2024 • 2
ryokamoi/realmistake
Viewer • Updated Aug 18, 2024 • 903 • 26 • 2