Arkil Patel

arkilpatel

https://arkilpatel.github.io/

AI & ML interests

NLP

authored 2 papers 3 months ago

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories

Paper • 2504.08942 • Published Apr 11 • 27

SafeArena: Evaluating the Safety of Autonomous Web Agents

Paper • 2503.04957 • Published Mar 6 • 21

authored 2 papers 5 months ago

Humanity's Last Exam

Paper • 2501.14249 • Published Jan 24 • 76

How to Get Your LLM to Generate Challenging Problems for Evaluation

Paper • 2502.14678 • Published Feb 20 • 18

authored 3 papers about 1 year ago

Universal Adversarial Triggers Are Not Universal

Paper • 2404.16020 • Published Apr 24, 2024

Are NLP Models really able to Solve Simple Math Word Problems?

Paper • 2103.07191 • Published Mar 12, 2021 • 1

Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions

Paper • 2310.03016 • Published Oct 4, 2023 • 2