ApexRetriever-Pro

A powerful 5-stage hybrid retrieval system combining sparse retrieval, dense semantic search, diversity optimization, reranking, and generative refinement.

Built for:

RAG pipelines
AI agents
semantic search
document QA
memory systems
knowledge retrieval
research assistants

Architecture

ApexRetriever-Pro uses a multi-stage retrieval pipeline:

Stage ① — BM25 Sparse Retrieval

Fast keyword-based retrieval using BM25.
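As an illustration of what this stage computes, here is a minimal pure-Python sketch of Okapi BM25 scoring. It is not the code in pipeline.py, which presumably relies on the rank-bm25 package from the install list. The tokenizer, the `bm25_scores` helper, and the `k1`/`b` defaults are assumptions chosen for the example.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive tokenizer (an assumption): lowercase, split on whitespace,
    # strip surrounding punctuation.
    return [t.strip(".,?!").lower() for t in text.split() if t.strip(".,?!")]

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Smoothed IDF variant that stays non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "Python was created by Guido van Rossum.",
    "Paris is the capital of France.",
    "Transformers power modern LLMs.",
]
scores = bm25_scores("Who created Python?", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

Only the first document shares terms with the query, so it is the only one with a non-zero score. This also shows the limitation of keyword matching that the dense stage is meant to cover.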

Stage ② — Dense Semantic Retrieval

Semantic vector search powered by:

BAAI/bge-small-en-v1.5

Uses FAISS for high-speed similarity search.
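Conceptually, a flat inner-product index over L2-normalized embeddings performs cosine-similarity search. The sketch below shows that computation in plain Python with toy vectors. It assumes the embeddings are normalized before indexing; whether pipeline.py does this, and which FAISS index type it uses, is not stated here. The `search` helper and the vectors are hypothetical.

```python
import math

def normalize(vec):
    # Scale a vector to unit length so inner product equals cosine similarity.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def search(query_vec, doc_vecs, top_k=2):
    """Return (index, score) pairs for the top_k highest inner products."""
    q = normalize(query_vec)
    scored = []
    for i, d in enumerate(doc_vecs):
        d = normalize(d)
        scored.append((i, sum(a * b for a, b in zip(q, d))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy 3-dimensional "embeddings" (hypothetical values, not real model output).
doc_vecs = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.1, 0.0, 0.9],
]
hits = search([1.0, 0.2, 0.0], doc_vecs, top_k=2)
print(hits)
```

A real index replaces the Python loop with vectorized or approximate search, but the ranking criterion is the same.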

Stage ③ — MMR Diversity Filtering

Maximal Marginal Relevance (MMR) balances relevance to the query against similarity to already-selected results, which improves diversity and reduces near-duplicate results.
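The greedy selection behind MMR can be sketched as follows. This is an illustrative implementation, not the one shipped in pipeline.py. The `mmr` function name, the `lambda_` parameter, and the similarity values are assumptions made for the example.

```python
def mmr(query_sims, doc_sims, top_k=2, lambda_=0.7):
    """Greedy MMR selection.

    query_sims[i]  : similarity of document i to the query
    doc_sims[i][j] : similarity between documents i and j
    lambda_        : 1.0 = pure relevance, 0.0 = pure diversity
    """
    selected = []
    remaining = list(range(len(query_sims)))
    while remaining and len(selected) < top_k:
        def mmr_score(i):
            # Penalize similarity to the closest already-selected document.
            redundancy = max((doc_sims[i][j] for j in selected), default=0.0)
            return lambda_ * query_sims[i] - (1 - lambda_) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Documents 0 and 1 are near-duplicates; document 2 is less relevant but distinct.
query_sims = [0.90, 0.88, 0.60]
doc_sims = [
    [1.00, 0.95, 0.10],
    [0.95, 1.00, 0.15],
    [0.10, 0.15, 1.00],
]
picked = mmr(query_sims, doc_sims, top_k=2, lambda_=0.5)
print(picked)  # [0, 2]
```

With `lambda_=0.5` the near-duplicate document 1 is skipped in favor of document 2. With `lambda_=1.0` the selection falls back to pure relevance order and returns documents 0 and 1.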

Stage ④ — CrossEncoder Reranking

High-quality neural reranking using:

cross-encoder/ms-marco-MiniLM-L-6-v2

Scores each query-document pair jointly, which typically gives more precise relevance ordering than the retrieval stages alone.
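The reranking step itself is simple once a pairwise scorer is available. The sketch below keeps the scorer pluggable so it runs without downloading a model. `rerank` and `overlap_score` are hypothetical helpers, and the word-overlap scorer is only a stand-in. In sentence-transformers, `CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2").predict` accepts a list of (query, passage) pairs and could be passed as `score_fn` instead.

```python
def rerank(query, candidates, score_fn, top_k=3):
    """Sort candidates by a pairwise relevance score, highest first."""
    pairs = [(query, doc) for doc in candidates]
    scores = score_fn(pairs)
    ranked = sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

def overlap_score(pairs):
    # Stand-in scorer (an assumption): counts shared lowercase words.
    # A cross-encoder would return learned relevance scores here.
    out = []
    for q, d in pairs:
        q_words = set(q.lower().strip("?.").split())
        d_words = set(d.lower().strip("?.").split())
        out.append(float(len(q_words & d_words)))
    return out

candidates = [
    "Paris is the capital of France.",
    "Python was created by Guido van Rossum.",
    "Transformers power modern LLMs.",
]
reranked = rerank("Who created Python?", candidates, overlap_score, top_k=2)
print(reranked[0])
```

Because every candidate is scored together with the query, this stage is usually applied only to the short list produced by the earlier stages rather than to the whole corpus.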

Stage ⑤ — FLAN-T5 Refinement

Final answer refinement using:

google/flan-t5-base

Generates concise refined outputs from retrieved context.

Features

Hybrid sparse+dense retrieval
FAISS accelerated search
MMR diversity optimization
Neural reranking
Generative refinement
GPU acceleration
Plug-and-play pipeline
Lightweight deployment
Kaggle compatible
HuggingFace compatible

Repository Structure

ApexRetriever-Pro/
│
├── bi_encoder/
├── reranker/
├── flan_t5/
├── pipeline.py
└── README.md

Installation

pip install -U \
    sentence-transformers \
    transformers \
    faiss-cpu \
    rank-bm25 \
    torch

Quick Start

from pipeline import ApexRetrieverPro

retriever = ApexRetrieverPro(model_dir=".")

# Example documents

docs = [
    "Python was created by Guido van Rossum.",
    "Paris is the capital of France.",
    "Transformers power modern LLMs."
]

# Build index

retriever.index_documents(docs)

# Retrieve

results = retriever.retrieve(
    "Who created Python?",
    top_k=3
)

print(results)

Example Output

[
    'Python was created by Guido van Rossum.'
]

Use Cases

Retrieval-Augmented Generation (RAG)
AI chatbots
Local document search
Agent memory systems
Knowledge bases
Research copilots
Semantic indexing
QA systems
Enterprise search

Performance Notes

Recommended:

CUDA GPU
16GB+ RAM
Python 3.10+

Works on:

Kaggle
Colab
Local GPU systems
Linux
Windows

Model Components

Component	Model
Dense Encoder	BAAI/bge-small-en-v1.5
Reranker	cross-encoder/ms-marco-MiniLM-L-6-v2
Refiner	google/flan-t5-base
Vector Engine	FAISS
Sparse Search	BM25

License

Apache 2.0

QuantaSparkLabs