CatalystGPT-3

CatalystGPT-3 is a specialized language model fine-tuned for Indian legal text understanding and generation. This model serves as an intelligent legal assistant capable of handling both English and Hindi queries related to Indian law, constitutional matters, and legal concepts.

Model Description

CatalystGPT-3 is designed to be a catalyst for legal research, education, and understanding in the Indian context. The model combines the conversational capabilities of its base architecture with specialized knowledge of Indian legal frameworks.

Base Model

Base Model: microsoft/DialoGPT-small
Model Type: Causal Language Model
Language(s): English, Hindi
License: Apache 2.0
Model Size: ~117M parameters

Training Data

Primary dataset: Indian legal text dataset
Additional data: Hindi-English legal terminology
Training examples: 1,000+ legal text samples
Languages: English and Hindi

Intended Use

Primary Use Cases:

🏛️ Indian legal text generation and analysis
❓ Legal question answering system
📚 Educational tool for law students and researchers
📝 Legal document drafting assistance
🔍 Constitutional and statutory interpretation support

Limitations:

⚠️ This model is for educational and research purposes only
⚠️ Not a substitute for professional legal advice from qualified attorneys
⚠️ May contain biases present in training data
⚠️ Should not be used for actual legal decision-making or court proceedings
⚠️ Responses should be verified with authoritative legal sources

Usage

Quick Start

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load CatalystGPT-3
tokenizer = AutoTokenizer.from_pretrained("sandylolpotty/CatalystGPT-3")
model = AutoModelForCausalLM.from_pretrained("sandylolpotty/CatalystGPT-3")

# Generate legal insights
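def ask_catalyst(prompt, max_new_tokens=200):
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,  # pass input_ids together with the attention mask
        max_new_tokens=max_new_tokens,  # limit on generated tokens, excluding the prompt
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id
    )
    # Decode only the newly generated tokens so the prompt is not echoed back
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)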

# Example usage
legal_query = "What is the role of the President of India in the legislative process?"
response = ask_catalyst(legal_query)
print(response)

Using Pipeline API

from transformers import pipeline

# Initialize CatalystGPT-3 pipeline
catalyst = pipeline('text-generation', model='sandylolpotty/CatalystGPT-3')

# Ask questions
questions = [
    "What are the fundamental duties of Indian citizens?",
    "भारत में न्यायपालिका की भूमिका क्या है?",  # Hindi: "What is the role of the judiciary in India?"
    "Explain the concept of judicial review in Indian context"
]

for question in questions:
    # return_full_text=False keeps the question out of the printed answer
    response = catalyst(question, max_length=150, do_sample=True, temperature=0.8, return_full_text=False)
    print(f"Q: {question}")
    print(f"A: {response[0]['generated_text']}")
    print("-" * 50)

Training Details

Training Procedure:

Training Epochs: 2
Batch Size: 1 (with gradient accumulation of 8)
Learning Rate: 5e-5
Optimizer: AdamW
Training Framework: Hugging Face Transformers
Gradient Checkpointing: Enabled for memory efficiency
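
As a rough illustration of how these settings interact, the sketch below collects them in a small dataclass and derives the effective batch size and the number of optimizer steps. The field names mirror the corresponding Hugging Face `TrainingArguments` parameters, but the `FineTuneConfig` class itself is hypothetical and is not the author's training script. The figure of exactly 1,000 examples on a single device is an assumption (the card only says "1,000+"), and rounding a partial final batch up is this sketch's choice.

```python
import math
from dataclasses import dataclass


@dataclass
class FineTuneConfig:
    # Values taken from the Training Procedure list above.
    num_train_epochs: int = 2
    per_device_train_batch_size: int = 1
    gradient_accumulation_steps: int = 8
    learning_rate: float = 5e-5
    gradient_checkpointing: bool = True

    @property
    def effective_batch_size(self) -> int:
        # Gradients from several micro-batches are accumulated before each optimizer step.
        return self.per_device_train_batch_size * self.gradient_accumulation_steps

    def optimizer_steps(self, num_examples: int) -> int:
        # One optimizer step per effective batch; a partial final batch is rounded up here.
        steps_per_epoch = math.ceil(num_examples / self.effective_batch_size)
        return steps_per_epoch * self.num_train_epochs


config = FineTuneConfig()
# ASSUMPTION: exactly 1,000 training examples on a single device.
print(config.effective_batch_size)   # 8
print(config.optimizer_steps(1000))  # 250
```

Under these assumptions the model sees an effective batch of 8 samples per update and receives about 250 weight updates in total, which is a small training budget and worth keeping in mind when reading the capability claims.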

Hardware Requirements:

Training: Runs on either GPU or CPU
Inference: Compatible with CPU inference
Memory: ~500MB for model loading
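
The memory figure is consistent with a back-of-the-envelope estimate from the parameter count. The sketch below assumes weights are stored as 32-bit floats (4 bytes each) and counts weights only; activations, the tokenizer, and framework overhead are not included. The helper `weight_memory_mib` is a hypothetical name used for illustration, and the half-precision line is an assumption about an alternative loading mode rather than something the card describes.

```python
def weight_memory_mib(num_parameters: int, bytes_per_parameter: int = 4) -> float:
    """Estimate memory for the model weights alone, in MiB."""
    return num_parameters * bytes_per_parameter / (1024 ** 2)


NUM_PARAMETERS = 117_000_000  # "~117M parameters" from the card

fp32_mib = weight_memory_mib(NUM_PARAMETERS)     # 4 bytes per float32 weight
fp16_mib = weight_memory_mib(NUM_PARAMETERS, 2)  # ASSUMPTION: half-precision loading

print(f"float32: {fp32_mib:.0f} MiB, float16: {fp16_mib:.0f} MiB")
# float32: 446 MiB, float16: 223 MiB
```

The float32 estimate of roughly 446 MiB leaves a small margin under the stated ~500MB for runtime overhead.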

Performance & Capabilities

Strengths:

✅ Bilingual support (English/Hindi) for legal queries
✅ Understanding of Indian constitutional framework
✅ Knowledge of fundamental rights and duties
✅ Familiarity with Indian legal terminology
✅ Conversational interface for legal education

Areas for Improvement:

🔄 Case law citations and references
🔄 Recent legal amendments and updates
🔄 State-specific legal variations
🔄 Complex legal procedure explanations

Ethical Considerations & Disclaimers

Important Legal Disclaimers:

📋 Not Legal Advice: This AI model does not provide legal advice
⚖️ Educational Purpose: Designed for learning and research only
👨‍💼 Consult Professionals: Always consult qualified legal professionals for legal matters
📊 Verify Information: Cross-check all information with authoritative sources
🎯 Bias Awareness: Model outputs may reflect training data biases

Responsible Use:

Use for educational and research purposes
Verify all legal information independently
Do not rely on model outputs for legal decisions
Be aware of potential inaccuracies or outdated information

Example Outputs

English Query:

Input: "What is Indian constitutional law?"
Output: "Indian constitutional law refers to the body of law that governs the interpretation and application of the Constitution of India. It encompasses the fundamental principles, rights, and duties outlined in the Constitution, along with judicial interpretations that shape the legal framework of the country."

Hindi Query:

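Input: "भारतीय कानून क्या है?" (English: "What is Indian law?")
Output: "भारतीय कानून एक व्यापक कानूनी प्रणाली है जो संविधान, कानून, और न्यायिक निर्णयों पर आधारित है। यह नागरिकों के अधिकारों और कर्तव्यों को परिभाषित करता है।" (English: "Indian law is a comprehensive legal system based on the Constitution, statutes, and judicial decisions. It defines the rights and duties of citizens.")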

Citation

If you use CatalystGPT-3 in your research or applications, please cite:

@misc{catalystgpt3_2024,
  title={CatalystGPT-3: Fine-tuned Model for Indian Legal Text Understanding},
  author={sandylolpotty},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/sandylolpotty/CatalystGPT-3}
}

Contact & Support

For questions, issues, or collaborations:

💬 Open a discussion on this model's Hugging Face page
🐛 Report issues through the Hugging Face interface
📧 Contact the model author through Hugging Face profile

Version History

v1.0: Initial release, fine-tuned from microsoft/DialoGPT-small on Indian legal text with English/Hindi support

CatalystGPT-3: Catalyzing Legal Understanding Through AI 🚀⚖️