TechniqueRAG: Retrieval Augmented Generation for Adversarial Technique Annotation in Cyber Threat Intelligence Text
Abstract
Accurately identifying adversarial techniques in security texts is critical for effective cyber defense. However, existing methods face a fundamental trade-off: they either rely on generic models with limited domain precision or require resource-intensive pipelines that depend on large labeled datasets and task-specific optimizations, such as custom hard-negative mining and denoising. These resources are rarely available in specialized domains. We propose TechniqueRAG, a domain-specific retrieval-augmented generation (RAG) framework that bridges this gap by integrating off-the-shelf retrievers, instruction-tuned LLMs, and a minimal set of text-technique pairs. Our approach addresses data scarcity by fine-tuning only the generation component on limited in-domain examples, circumventing the need for resource-intensive retrieval training. While conventional RAG mitigates hallucination by coupling retrieval and generation, its reliance on generic retrievers often introduces noisy candidates, limiting domain-specific precision. To address this, we enhance retrieval quality and domain specificity through zero-shot LLM re-ranking, which explicitly aligns retrieved candidates with adversarial techniques. Experiments on multiple security benchmarks demonstrate that TechniqueRAG achieves state-of-the-art performance without extensive task-specific optimizations or large labeled datasets, while comprehensive analysis provides further insights.
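The three stages named in the abstract (off-the-shelf retrieval, zero-shot re-ranking, generation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the tiny catalog (IDs and names follow MITRE ATT&CK, descriptions are paraphrased), the token-overlap retriever, the stopword-based `toy_judge` standing in for a zero-shot LLM call, and the pass-through `generate` standing in for the fine-tuned generator.

```python
import re
from typing import Callable, List

# Hypothetical mini catalog: (technique ID, name, short description).
CATALOG = [
    ("T1566", "Phishing", "adversary sends email with malicious attachment or link to gain access"),
    ("T1059", "Command and Scripting Interpreter", "adversary executes commands or scripts such as powershell"),
    ("T1003", "OS Credential Dumping", "adversary dumps credentials or password hashes from memory"),
    ("T1486", "Data Encrypted for Impact", "adversary encrypts files on the system to demand ransom"),
]
DOCS = {tid: f"{name} {desc}" for tid, name, desc in CATALOG}

# Assumed stopword list used only by the toy judge below.
STOP = {"the", "to", "with", "a", "an", "or", "and", "as", "such", "on", "for", "from", "then"}


def tokens(text: str) -> set:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, k: int = 3) -> List[str]:
    """Stand-in for a generic retriever: rank techniques by raw token overlap."""
    q = tokens(query)
    scored = sorted(DOCS, key=lambda tid: (-len(q & tokens(DOCS[tid])), tid))
    return scored[:k]


def toy_judge(query: str, tid: str) -> float:
    """Stand-in for a zero-shot LLM relevance judgment (assumption, not an LLM)."""
    return float(len((tokens(query) - STOP) & (tokens(DOCS[tid]) - STOP)))


def rerank(query: str, candidates: List[str],
           judge: Callable[[str, str], float] = toy_judge) -> List[str]:
    """Re-order candidates by judged relevance and drop those judged irrelevant."""
    scored = [(judge(query, tid), tid) for tid in candidates]
    kept = [(s, tid) for s, tid in scored if s > 0]
    kept.sort(key=lambda pair: -pair[0])  # stable sort keeps retriever order on ties
    return [tid for _, tid in kept]


def generate(query: str, candidates: List[str]) -> List[str]:
    """Placeholder for the fine-tuned generator: passes candidates through unchanged."""
    return list(candidates)


def annotate(query: str, k: int = 3) -> List[str]:
    """Retrieve, re-rank, then generate technique labels for one text."""
    return generate(query, rerank(query, retrieve(query, k)))


if __name__ == "__main__":
    text = "The attacker sent an email with a malicious attachment, then used PowerShell to run scripts."
    print("retrieved:", retrieve(text))
    print("annotated:", annotate(text))
```

In this toy run the raw-overlap retriever also returns T1486, because it matches only on words like "the" and "to"; the re-ranking step removes it. This is a small-scale analogue of the noisy-candidate problem the abstract attributes to generic retrievers.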
Community
Summary:
TechniqueRAG is a domain-specific retrieval-augmented generation (RAG) framework for identifying adversarial techniques in cybersecurity texts. It avoids the limitations of generic models and resource-heavy pipelines by fine-tuning only the generation component using minimal data. To improve precision, it uses zero-shot LLM re-ranking to refine retrieved results. TechniqueRAG outperforms existing methods on security benchmarks without needing extensive labeled data or custom optimizations.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- One Shot Dominance: Knowledge Poisoning Attack on Retrieval-Augmented Generation Systems (2025)
- Don't Lag, RAG: Training-Free Adversarial Detection Using RAG (2025)
- Learning to Erase Private Knowledge from Multi-Documents for Retrieval-Augmented Large Language Models (2025)
- A Few Large Shifts: Layer-Inconsistency Based Minimal Overhead Adversarial Example Detection (2025)
- Secure Multifaceted-RAG for Enterprise: Hybrid Knowledge Retrieval with Security Filtering (2025)
- Helping Large Language Models Protect Themselves: An Enhanced Filtering and Summarization System (2025)
- Generalizable Vision-Language Few-Shot Adaptation with Predictive Prompts and Negative Learning (2025)