# Nyayadrishti-BERT-v3
Nyayadrishti-BERT-v3 is a domain-specific Sentence-BERT model fine-tuned for semantic similarity and legal information retrieval tasks, particularly focused on Indian legal texts. It supports use cases like legal research, chatbot backends, judicial QA, and vector-based search pipelines.
This version builds upon the all-MiniLM-L6-v2 architecture and is fine-tuned using MultipleNegativesRankingLoss, optimized on sentence pairs from a custom-curated and cleaned dataset of Indian legal text.
## Key Features

### Legal-Domain Fine-Tuning
Trained specifically on Indian legal judgments, acts, and petitions.

### Contrastive Learning with MNRL
Uses `MultipleNegativesRankingLoss` for learning meaningful dense vector representations of legal queries and references.

### Evaluation Results
- Accuracy@1 / Precision@1 / Recall@1: 0.90
- Accuracy@3 / Recall@3: 1.00
- Accuracy@5 / Recall@5: 1.00
- Accuracy@10 / Recall@10: 1.00
- MRR@10: 0.95
- MAP@100: 0.95
- NDCG@10: 0.963
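The metrics above are produced by cosine-similarity retrieval ranking. As a rough illustration (not the actual evaluation script), Accuracy@k and MRR can be computed from a query-to-document similarity matrix like this, shown here on toy data:

```python
import numpy as np

def retrieval_metrics(sim, relevant, ks=(1, 3, 5, 10)):
    """sim: (n_queries, n_docs) cosine-similarity matrix;
    relevant: index of the single relevant doc for each query."""
    ranks = []
    for i, rel in enumerate(relevant):
        order = np.argsort(-sim[i])          # docs sorted by similarity
        ranks.append(int(np.where(order == rel)[0][0]) + 1)
    ranks = np.asarray(ranks)
    accuracy = {k: float(np.mean(ranks <= k)) for k in ks}
    mrr = float(np.mean(1.0 / ranks))        # mean reciprocal rank
    return accuracy, mrr

# Toy example: query 0 ranks its document first, query 1 ranks it second
sim = np.array([[0.9, 0.1],
                [0.6, 0.4]])
acc, mrr = retrieval_metrics(sim, relevant=[0, 1], ks=(1, 3))
```

With a single relevant document per query, Accuracy@k, Precision@k, and Recall@k coincide, which is why the reported values pair up.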
### Easily Integrates with FAISS, LangChain, and RAG
Built for modern legal NLP workflows and production-ready retrieval systems.
## Use Cases

- Legal Semantic Search
- Chatbot Retrieval Modules
- Judgment Preprocessing
- Legal Document Clustering
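For the clustering use case, the embeddings can be fed to any standard algorithm. A minimal sketch with scikit-learn's `KMeans`, again using synthetic vectors in place of real `model.encode(...)` output:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Two well-separated blobs standing in for legal-document embeddings
emb = np.vstack([
    rng.normal(0.0, 0.05, size=(20, 384)),
    rng.normal(1.0, 0.05, size=(20, 384)),
]).astype("float32")

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(emb)
labels = kmeans.labels_  # cluster assignment per document
```

In a real pipeline the inputs would be `model.encode(documents)` and the number of clusters would be chosen by inspection or a metric such as silhouette score.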
## Training Details
Attribute | Value |
---|---|
Base Model | sentence-transformers/all-MiniLM-L6-v2 |
Fine-Tuning Loss | MultipleNegativesRankingLoss |
Epochs | 3 |
Batch Size | 32 |
Dataset | Custom-curated Indian legal text pairs |
Evaluation | Cosine Similarity-based retrieval ranking |
Training was performed using PyTorch and `sentence-transformers` on Google Colab Pro (GPU).
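`MultipleNegativesRankingLoss` treats each in-batch pair as a positive and every other pair's `sentence_1` as a negative. A dependency-light numpy sketch of the computation (the real implementation lives in `sentence-transformers`; the scale factor of 20 matches its default):

```python
import numpy as np

def mnr_loss(query_emb, pos_emb, scale=20.0):
    # Cosine-similarity matrix between all queries and all positives
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    p = pos_emb / np.linalg.norm(pos_emb, axis=1, keepdims=True)
    scores = scale * (q @ p.T)                       # (batch, batch)
    # Cross-entropy where the matching pair (the diagonal) is the target
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -float(np.mean(np.diag(log_softmax)))

# Perfectly aligned, mutually orthogonal pairs -> loss close to zero
emb = np.eye(4)
loss = mnr_loss(emb, emb)
```

Because every other pair in the batch serves as a free negative, larger batch sizes generally make the task harder and the embeddings sharper, which is why batch size 32 is a meaningful training detail here.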
## Training Dataset Details
Dataset: Custom training set of Indian legal judgment text pairs
Size: 180 samples
Columns: `sentence_0`, `sentence_1` (query, related legal sentence)
Token Statistics:
Field | Min | Mean | Max |
---|---|---|---|
sentence_0 | 8 | 15.35 | 34 |
sentence_1 | 11 | 57.56 | 172 |
Sample Pairs:
sentence_0 | sentence_1 |
---|---|
Who was the appellant's polling agent? | The appellant's polling agent Jang Bahadur Mian, P.W.6, has stated in his evidence that the jeep 23 USJ 5226 was being used for carrying electors to cast votes in favour of the respondent, that the respondent met the expenses of electors and that the jeep was seized by the District Magistrate and the police on the day of poll. |
How many votes did the respondent secure? | The appellant secured 1795 votes while the respondent secured 28324 votes and was declared elected on 15.6.1977. |
What does Section 10(2)(vi) of the Act state regarding depreciation? | What Section (10) (2) (vi) of the Act says is that depreciation will be allowed on the building. |
## Quick Start
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("rossieRuby/nyayadrishti-bert-v3")

query = "What is the punishment for bribery under IPC?"
embedding = model.encode(query)  # 384-dimensional dense vector
```