DeepCodeSeek: Real-Time API Retrieval for Context-Aware Code Generation
Abstract
A technique for real-time API prediction and code generation uses a compact reranker that outperforms larger models at lower latency, and addresses API leaks and unclear usage intent in enterprise code.
Current search techniques are limited to standard RAG query-document applications. In this paper, we propose a novel technique that expands the code and the index to predict the required APIs, directly enabling high-quality, end-to-end code generation for auto-completion and agentic AI applications. We address the problem of API leaks in current code-to-code benchmark datasets by introducing a new dataset built from real-world ServiceNow Script Includes, which captures the challenge of unclear API usage intent in code. In our evaluation, this method achieves 87.86% top-40 retrieval accuracy, supplying the critical API context needed for successful downstream code generation. To enable real-time predictions, we develop a comprehensive post-training pipeline that optimizes a compact 0.6B reranker through synthetic dataset generation, supervised fine-tuning, and reinforcement learning. With this pipeline, our compact reranker outperforms a much larger 8B model at 2.5x lower latency, effectively addressing the nuances of enterprise-specific code without the computational overhead of larger models.
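To make the top-40 retrieval accuracy figure concrete, here is a minimal sketch of how a top-k accuracy metric of this kind could be computed. It assumes that a query counts as a hit only when every required API appears among the first k retrieved results; the abstract does not spell out the exact definition, so treat that rule, the function name `top_k_accuracy`, and the toy API names as illustrative assumptions rather than the paper's evaluation code.

```python
from typing import Sequence, Set


def top_k_accuracy(
    ranked: Sequence[Sequence[str]],
    gold: Sequence[Set[str]],
    k: int = 40,
) -> float:
    """Fraction of queries whose required APIs all appear in the top-k results.

    Assumption: a query is a hit only if its whole gold set is retrieved.
    """
    if len(ranked) != len(gold):
        raise ValueError("ranked and gold must have the same length")
    if not ranked:
        return 0.0
    hits = sum(1 for results, needed in zip(ranked, gold) if needed <= set(results[:k]))
    return hits / len(ranked)


# Toy data with hypothetical API names (not taken from the paper's dataset).
ranked = [
    ["IncidentUtils", "ArrayHelper", "DateHelper"],
    ["DateHelper", "ArrayHelper", "IncidentUtils"],
]
gold = [{"ArrayHelper"}, {"IncidentUtils"}]

print(top_k_accuracy(ranked, gold, k=2))  # 0.5: only the first query is a hit
print(top_k_accuracy(ranked, gold, k=3))  # 1.0: both gold APIs are now in range
```

A larger k trades precision of the candidate list for recall, which is why a reranker is typically applied afterwards to reorder the retrieved candidates.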
Community
A multi-stage retrieval system that achieves 87.86% top-40 accuracy for API prediction in enterprise code completion. Using synthetic data generation, supervised fine-tuning, and reinforcement learning, the team trained a compact 0.6B reranker that outperforms 8B models with 2.5x faster inference. The system tackles real-world ServiceNow Script Include retrieval by combining knowledge graph filtering, enriched JSDoc indexing, and LLM-powered query enhancement.
Paper: https://arxiv.org/abs/2509.25716
Open-source library: https://github.com/ServiceNow/snowdoc
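The multi-stage design described in the comment above (filter, retrieve, rerank) can be sketched roughly as follows. This is not the paper's implementation: the `scope` field stands in for knowledge graph filtering, a simple token-overlap score stands in for the first-stage retriever over enriched documentation, and the optional `rerank` callable stands in for the compact reranker. All names here (`ApiDoc`, `token_overlap`, `retrieve`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set


@dataclass
class ApiDoc:
    name: str   # API (Script Include) name
    doc: str    # enriched documentation text, e.g. a JSDoc-style summary
    scope: str  # hypothetical label used as a stand-in for graph filtering


def token_overlap(query: str, text: str) -> float:
    """Share of query tokens found in the text (a toy first-stage scorer)."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def retrieve(
    query: str,
    docs: List[ApiDoc],
    allowed_scopes: Set[str],
    first_stage_k: int = 40,
    final_k: int = 5,
    rerank: Optional[Callable[[str, ApiDoc], float]] = None,
) -> List[str]:
    # Stage 1: drop candidates outside the allowed scopes.
    candidates = [d for d in docs if d.scope in allowed_scopes]
    # Stage 2: keep the best first_stage_k candidates by the cheap score.
    candidates.sort(key=lambda d: token_overlap(query, d.doc), reverse=True)
    shortlist = candidates[:first_stage_k]
    # Stage 3: optionally reorder the shortlist with a stronger scorer.
    if rerank is not None:
        shortlist = sorted(shortlist, key=lambda d: rerank(query, d), reverse=True)
    return [d.name for d in shortlist[:final_k]]


docs = [
    ApiDoc("IncidentUtils", "close resolve incident records", "global"),
    ApiDoc("DateHelper", "format date time values", "global"),
    ApiDoc("HrCaseUtils", "close resolve hr case records", "hr"),
]

print(retrieve("close incident records", docs, {"global"}, final_k=1))
```

In a real system the cheap scorer would likely be an embedding or lexical index and the reranker a learned model; the point of the sketch is only the ordering of the stages, where the expensive scorer sees a shortlist rather than the whole index.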
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- SynthCoder: A Synthetical Strategy to Tune LLMs for Code Completion (2025)
- Tool-integrated Reinforcement Learning for Repo Deep Search (2025)
- Enhancing LLM-based Fault Localization with a Functionality-Aware Retrieval-Augmented Generation Framework (2025)
- RefactorCoderQA: Benchmarking LLMs for Multi-Domain Coding Question Solutions in Cloud and Edge Deployment (2025)
- ReCode: Improving LLM-based Code Repair with Fine-Grained Retrieval-Augmented Generation (2025)
- Think Less, Label Better: Multi-Stage Domain-Grounded Synthetic Data Generation for Fine-Tuning Large Language Models in Telecommunications (2025)
- Impact-driven Context Filtering For Cross-file Code Completion (2025)