Models deployed on Hugging Face or RunPod.
AI & ML interests
LLM Evaluation
Papers
A benchmark for tip-of-the-tongue search and reasoning.
- PatronusAI/lynx-70b-instruct-covidqa-generations
  Viewer • Updated • 1k • 16
- PatronusAI/lynx-70b-instruct-drop-generations
  Viewer • Updated • 1k • 22
- PatronusAI/lynx-70b-instruct-financebench-generations
  Viewer • Updated • 1k • 14
- PatronusAI/lynx-70b-instruct-halueval-generations
  Viewer • Updated • 10k • 17