βš–οΈ Nyayadrishti-BERT-v3

Nyayadrishti-BERT-v3 is a domain-specific Sentence-BERT model fine-tuned for semantic similarity and legal information retrieval tasks, particularly focused on Indian legal texts. It supports use cases like legal research, chatbot backends, judicial QA, and vector-based search pipelines.

This version builds on the all-MiniLM-L6-v2 architecture and is fine-tuned with MultipleNegativesRankingLoss, optimized on sentence pairs from a custom-curated, cleaned dataset of Indian legal text.


🌟 Key Features

  • βœ… Legal-Domain Fine-Tuning
    Trained specifically on Indian legal judgments, acts, and petitions.

  • 🧠 Contrastive Learning with MNRL
    Uses MultipleNegativesRankingLoss for learning meaningful dense vector representations of legal queries and references.

  • πŸ“ˆ Evaluation Results

    • Accuracy@1 / Precision@1 / Recall@1: 0.90
    • Accuracy@3 / Recall@3: 1.00
    • Accuracy@5 / Recall@5: 1.00
    • Accuracy@10 / Recall@10: 1.00
    • MRR@10: 0.95
    • MAP@100: 0.95
    • NDCG@10: 0.963
  • πŸ”Œ Easily Integrates with FAISS, LangChain, and RAG
    Built for modern legal NLP workflows and production-ready retrieval systems.


πŸ’Ό Use Cases

  • πŸ” Legal Semantic Search
  • πŸ€– Chatbot Retrieval Modules
  • πŸ“‘ Judgment Preprocessing
  • 🧾 Legal Document Clustering

πŸ§ͺ Training Details

Attribute          Value
Base Model         sentence-transformers/all-MiniLM-L6-v2
Fine-Tuning Loss   MultipleNegativesRankingLoss
Epochs             3
Batch Size         32
Dataset            Custom Indian legal judgment text pairs
Evaluation         Cosine similarity-based retrieval ranking

Training performed using PyTorch and sentence-transformers on Google Colab Pro (GPU).


πŸ—ƒοΈ Training Dataset Details

Dataset: Custom training set built from Indian legal judgment text pairs
Size: 180 samples
Columns: sentence_0 (query), sentence_1 (related legal sentence)

Token Statistics:

Field        Min    Mean    Max
sentence_0     8   15.35     34
sentence_1    11   57.56    172

Sample Pairs:

sentence_0: Who was the appellant's polling agent?
sentence_1: The appellant's polling agent Jang Bahadur Mian, P.W.6, has stated in his evidence that the jeep 23 USJ 5226 was being used for carrying electors to cast votes in favour of the respondent, that the respondent met the expenses of electors and that the jeep was seized by the District Magistrate and the police on the day of poll.

sentence_0: How many votes did the respondent secure?
sentence_1: The appellant secured 1795 votes while the respondent secured 28324 votes and was declared elected on 15.6.1977.

sentence_0: What does Section 10(2)(vi) of the Act state regarding depreciation?
sentence_1: What Section (10) (2) (vi) of the Act says is that depreciation will be allowed on the building.

πŸš€ Quick Start

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("rossieRuby/nyayadrishti-bert-v3")

query = "What is the punishment for bribery under IPC?"
embedding = model.encode(query)
Model size: 22.7M parameters (F32, Safetensors)
