Submitted by
Tu Trinh
Scale AI
company
Verified
Papers
HiL-Bench (Human-in-Loop Benchmark): Do Agents Know When to Ask for Help?
SciPredict: Can LLMs Predict the Outcomes of Scientific Experiments in Natural Sciences?