This paper introduces the notion of "Tests as Prompt". It compiles the results and findings on WebApp1K published in three previous papers.
https://huggingface.co/papers?q=2505.09027
The central argument here is that test-driven development is a natural fit for LLMs, which scale better than humans. I bet the future will see thousands of such leaderboards (many more proprietary ones), each dominated by a specialized model.