# Nyayadrishti-BERT-v3
Nyayadrishti-BERT-v3 is a domain-specific Sentence-BERT model fine-tuned for semantic similarity and legal information retrieval tasks, particularly focused on Indian legal texts. It supports use cases like legal research, chatbot backends, judicial QA, and vector-based search pipelines.
This version builds upon the all-MiniLM-L6-v2 architecture and is fine-tuned using MultipleNegativesRankingLoss, optimized on sentence pairs from a custom-curated and cleaned dataset of Indian legal text.
## Key Features

### Legal-Domain Fine-Tuning
Trained specifically on Indian legal judgments, acts, and petitions.

### Contrastive Learning with MNRL
Uses `MultipleNegativesRankingLoss` for learning meaningful dense vector representations of legal queries and references.

### Evaluation Results
- Accuracy@1 / Precision@1 / Recall@1: 0.90
- Accuracy@3 / Recall@3: 1.00
- Accuracy@5 / Recall@5: 1.00
- Accuracy@10 / Recall@10: 1.00
- MRR@10: 0.95
- MAP@100: 0.95
- NDCG@10: 0.963
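The metrics above are produced by cosine-similarity retrieval ranking. As a rough illustration (not the actual evaluation script), Accuracy@k and MRR can be computed from a query-to-document similarity matrix like this, shown here on toy data:

```python
import numpy as np

def retrieval_metrics(sim, relevant, ks=(1, 3, 5, 10)):
    """sim: (n_queries, n_docs) cosine-similarity matrix;
    relevant: index of the single relevant doc for each query."""
    ranks = []
    for i, rel in enumerate(relevant):
        order = np.argsort(-sim[i])          # docs sorted by similarity
        ranks.append(int(np.where(order == rel)[0][0]) + 1)
    ranks = np.asarray(ranks)
    accuracy = {k: float(np.mean(ranks <= k)) for k in ks}
    mrr = float(np.mean(1.0 / ranks))        # mean reciprocal rank
    return accuracy, mrr

# Toy example: query 0 ranks its document first, query 1 ranks it second
sim = np.array([[0.9, 0.1],
                [0.6, 0.4]])
acc, mrr = retrieval_metrics(sim, relevant=[0, 1], ks=(1, 3))
```

With a single relevant document per query, Accuracy@k, Precision@k, and Recall@k coincide, which is why the reported values pair up.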
### Easily Integrates with FAISS, LangChain, and RAG
Built for modern legal NLP workflows and production-ready retrieval systems.
## Use Cases

- Legal Semantic Search
- Chatbot Retrieval Modules
- Judgment Preprocessing
- Legal Document Clustering
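For the clustering use case, the embeddings can be fed to any standard algorithm. A minimal sketch with scikit-learn's `KMeans`, again using synthetic vectors in place of real `model.encode(...)` output:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Two well-separated blobs standing in for legal-document embeddings
emb = np.vstack([
    rng.normal(0.0, 0.05, size=(20, 384)),
    rng.normal(1.0, 0.05, size=(20, 384)),
]).astype("float32")

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(emb)
labels = kmeans.labels_  # cluster assignment per document
```

In a real pipeline the inputs would be `model.encode(documents)` and the number of clusters would be chosen by inspection or a metric such as silhouette score.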
## Training Details
Attribute | Value |
---|---|
Base Model | sentence-transformers/all-MiniLM-L6-v2 |
Fine-Tuning Loss | MultipleNegativesRankingLoss |
Epochs | 3 |
Batch Size | 32 |
Dataset | Custom-curated Indian legal text pairs |
Evaluation | Cosine Similarity-based retrieval ranking |
Training was performed using PyTorch and `sentence-transformers` on Google Colab Pro (GPU).
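`MultipleNegativesRankingLoss` treats each in-batch pair as a positive and every other pair's `sentence_1` as a negative. A dependency-light numpy sketch of the computation (the real implementation lives in `sentence-transformers`; the scale factor of 20 matches its default):

```python
import numpy as np

def mnr_loss(query_emb, pos_emb, scale=20.0):
    # Cosine-similarity matrix between all queries and all positives
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    p = pos_emb / np.linalg.norm(pos_emb, axis=1, keepdims=True)
    scores = scale * (q @ p.T)                       # (batch, batch)
    # Cross-entropy where the matching pair (the diagonal) is the target
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -float(np.mean(np.diag(log_softmax)))

# Perfectly aligned, mutually orthogonal pairs -> loss close to zero
emb = np.eye(4)
loss = mnr_loss(emb, emb)
```

Because every other pair in the batch serves as a free negative, larger batch sizes generally make the task harder and the embeddings sharper, which is why batch size 32 is a meaningful training detail here.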
## Training Dataset Details
Dataset: Custom training set of Indian legal judgment text pairs
Size: 180 samples
Columns: `sentence_0`, `sentence_1` (query, related legal sentence)
Token Statistics:
Field | Min | Mean | Max |
---|---|---|---|
sentence_0 | 8 | 15.35 | 34 |
sentence_1 | 11 | 57.56 | 172 |
Sample Pairs:
sentence_0 | sentence_1 |
---|---|
Who was the appellant's polling agent? | The appellant's polling agent Jang Bahadur Mian, P.W.6, has stated in his evidence that the jeep 23 USJ 5226 was being used for carrying electors to cast votes in favour of the respondent, that the respondent met the expenses of electors and that the jeep was seized by the District Magistrate and the police on the day of poll. |
How many votes did the respondent secure? | The appellant secured 1795 votes while the respondent secured 28324 votes and was declared elected on 15.6.1977. |
What does Section 10(2)(vi) of the Act state regarding depreciation? | What Section (10) (2) (vi) of the Act says is that depreciation will be allowed on the building. |
## Quick Start
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("rossieRuby/nyayadrishti-bert-v3")

query = "What is the punishment for bribery under IPC?"
embedding = model.encode(query)  # 384-dimensional dense vector
```