ApexRetriever-Pro

A five-stage hybrid retrieval pipeline that combines sparse retrieval, dense semantic search, diversity filtering, reranking, and generative refinement.

Built for:

  • RAG pipelines
  • AI agents
  • semantic search
  • document QA
  • memory systems
  • knowledge retrieval
  • research assistants

Architecture

ApexRetriever-Pro uses a multi-stage retrieval pipeline:

Stage ①: BM25 Sparse Retrieval

Fast keyword-based retrieval using BM25.
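As a rough illustration of what this stage computes, here is a minimal, dependency-free sketch of Okapi BM25 scoring. It is not the code in pipeline.py: the function name `bm25_scores`, the regex tokenizer, the defaults `k1=1.5` and `b=0.75`, and the non-negative (Lucene-style) IDF variant are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase word characters only.
    return re.findall(r"\w+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokenized for term in set(toks))
    query_terms = tokenize(query)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Non-negative IDF variant: log(1 + (N - df + 0.5) / (df + 0.5)).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


docs = [
    "Python was created by Guido van Rossum.",
    "Paris is the capital of France.",
    "Transformers power modern LLMs.",
]
scores = bm25_scores("Who created Python?", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

Only the first document shares terms with the query, so it is the only one with a non-zero score.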

Stage ②: Dense Semantic Retrieval

Semantic vector search powered by:

  • BAAI/bge-small-en-v1.5

Uses FAISS for high-speed similarity search.
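The card does not say which FAISS index type is used. As a hedged sketch, the snippet below shows what an exact inner-product search over L2-normalized embeddings computes (the same quantity a flat index such as `faiss.IndexFlatIP` would return on normalized vectors). The toy 3-dimensional vectors stand in for real bge-small embeddings, and `top_k_inner_product` is a hypothetical helper name.

```python
import math


def normalize(vec):
    # Scale to unit length so that inner product equals cosine similarity.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k_inner_product(query_vec, doc_vecs, k):
    """Brute-force nearest-neighbor search by inner product."""
    q = normalize(query_vec)
    scored = []
    for idx, vec in enumerate(doc_vecs):
        d = normalize(vec)
        scored.append((sum(a * b for a, b in zip(q, d)), idx))
    # Highest similarity first.
    scored.sort(reverse=True)
    return [(idx, score) for score, idx in scored[:k]]


# Toy embeddings (assumed values, for illustration only).
doc_vecs = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.1, 0.0, 0.9],
]
query_vec = [1.0, 0.2, 0.0]

hits = top_k_inner_product(query_vec, doc_vecs, k=2)
print(hits)
```

In a real setup the vectors would come from the encoder, and the brute-force loop would be replaced by a FAISS index lookup.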

Stage ③: MMR Diversity Filtering

Maximal Marginal Relevance (MMR) balances relevance to the query against similarity to results already selected, which improves diversity and reduces near-duplicate hits.
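A minimal sketch of greedy MMR selection follows. It assumes precomputed similarities, and the function name `mmr_select` and the trade-off parameter `lambda_` are illustrative; the card does not state how the pipeline sets this trade-off.

```python
def mmr_select(query_sims, doc_sims, k, lambda_=0.7):
    """Greedy MMR over precomputed similarities.

    query_sims[i]   similarity of document i to the query
    doc_sims[i][j]  similarity between documents i and j
    lambda_         1.0 = pure relevance, 0.0 = pure diversity
    """
    candidates = list(range(len(query_sims)))
    selected = []
    while candidates and len(selected) < k:
        def mmr_score(i):
            # Penalize similarity to the closest already-selected document.
            redundancy = max((doc_sims[i][j] for j in selected), default=0.0)
            return lambda_ * query_sims[i] - (1 - lambda_) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Documents 0 and 1 are near-duplicates; document 2 is different but less relevant.
query_sims = [0.90, 0.88, 0.60]
doc_sims = [
    [1.00, 0.95, 0.10],
    [0.95, 1.00, 0.10],
    [0.10, 0.10, 1.00],
]

print(mmr_select(query_sims, doc_sims, k=2, lambda_=0.5))  # [0, 2]
print(mmr_select(query_sims, doc_sims, k=2, lambda_=1.0))  # [0, 1]
```

With `lambda_=0.5` the near-duplicate document 1 is skipped in favor of document 2; with `lambda_=1.0` the selection falls back to plain relevance order.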

Stage ④: CrossEncoder Reranking

High-quality neural reranking using:

  • cross-encoder/ms-marco-MiniLM-L-6-v2

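The cross-encoder scores each query-document pair jointly, which typically improves precision at the top of the ranking compared with embedding similarity alone.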
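The reranking step can be sketched as "score every (query, document) pair, then sort". In the snippet below, `rerank` and `overlap_scorer` are hypothetical names, and the word-overlap scorer is only a stand-in so the example runs without downloading a model. With sentence-transformers installed, the `predict` method of a `CrossEncoder` loaded from cross-encoder/ms-marco-MiniLM-L-6-v2 could be passed as `score_pairs` instead, since it takes a list of pairs and returns one score per pair.

```python
def rerank(query, candidates, score_pairs, top_k=3):
    """Reorder candidates by the scores that score_pairs assigns to (query, doc) pairs."""
    if not candidates:
        return []
    pairs = [(query, doc) for doc in candidates]
    scores = score_pairs(pairs)
    ranked = sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)
    return [doc for doc, _ in ranked[:top_k]]


def overlap_scorer(pairs):
    # Stand-in scorer (assumption): fraction of query words found in the document.
    scores = []
    for query, doc in pairs:
        q_words = set(query.lower().split())
        d_words = set(doc.lower().split())
        scores.append(len(q_words & d_words) / len(q_words))
    return scores


candidates = [
    "Paris is the capital of France.",
    "Python was created by Guido van Rossum.",
    "Transformers power modern LLMs.",
]
reranked = rerank("Who created Python?", candidates, overlap_scorer, top_k=2)
print(reranked)
```

Keeping the scorer as a parameter makes it easy to swap the stand-in for the real model without touching the sorting logic.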

Stage ⑤: FLAN-T5 Refinement

Final answer refinement using:

  • google/flan-t5-base

Generates concise refined outputs from retrieved context.
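The card does not document the prompt used for refinement. The sketch below shows one plausible way to assemble retrieved passages into a prompt for an instruction-tuned model; the template, `build_refine_prompt`, `refine`, and the stand-in generator are all assumptions. In practice `generate` would wrap a call to the FLAN-T5 model.

```python
def build_refine_prompt(query, passages, max_passages=3):
    """Assemble an instruction-style prompt from the top retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages[:max_passages])
    return (
        "Answer the question using only the context.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
        "Answer:"
    )


def refine(query, passages, generate):
    """Run the generator on the assembled prompt; return '' if nothing was retrieved."""
    if not passages:
        return ""
    return generate(build_refine_prompt(query, passages)).strip()


def echo_first_passage(prompt):
    # Stand-in generator (assumption): returns the first context line unchanged.
    for line in prompt.splitlines():
        if line.startswith("- "):
            return line[2:]
    return ""


passages = ["Python was created by Guido van Rossum."]
prompt = build_refine_prompt("Who created Python?", passages)
print(prompt)
print(refine("Who created Python?", passages, echo_first_passage))
```

Capping the number of passages keeps the prompt within the model's input limit; the cap of three here is illustrative.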


Features

  • Hybrid sparse+dense retrieval
  • FAISS-accelerated search
  • MMR diversity optimization
  • Neural reranking
  • Generative refinement
  • GPU acceleration
  • Plug-and-play pipeline
  • Lightweight deployment
  • Kaggle compatible
  • HuggingFace compatible

Repository Structure

ApexRetriever-Pro/
│
├── bi_encoder/
├── reranker/
├── flan_t5/
├── pipeline.py
└── README.md

Installation

pip install -U \
    sentence-transformers \
    transformers \
    faiss-cpu \
    rank-bm25 \
    torch

Quick Start

from pipeline import ApexRetrieverPro

retriever = ApexRetrieverPro(model_dir=".")

# Example documents

docs = [
    "Python was created by Guido van Rossum.",
    "Paris is the capital of France.",
    "Transformers power modern LLMs."
]

# Build index

retriever.index_documents(docs)

# Retrieve

results = retriever.retrieve(
    "Who created Python?",
    top_k=3
)

print(results)

Example Output

[
    'Python was created by Guido van Rossum.'
]

Use Cases

  • Retrieval-Augmented Generation (RAG)
  • AI chatbots
  • Local document search
  • Agent memory systems
  • Knowledge bases
  • Research copilots
  • Semantic indexing
  • QA systems
  • Enterprise search

Performance Notes

Recommended:

  • CUDA GPU
  • 16GB+ RAM
  • Python 3.10+

Works on:

  • Kaggle
  • Colab
  • Local GPU systems
  • Linux
  • Windows

Model Components

Component       Model
Dense Encoder   BAAI/bge-small-en-v1.5
Reranker        cross-encoder/ms-marco-MiniLM-L-6-v2
Refiner         google/flan-t5-base
Vector Engine   FAISS
Sparse Search   BM25

License

Apache 2.0


QuantaSparkLabs
