---
tags:
- unsloth
datasets:
- chhatramani/nepal_civil_law_QA_v2
- chhatramani/nepali_muluki_dewani_QA_v1
language:
- en
- ne
metrics:
- bleu
base_model:
- google/gemma-3-4b-it
---

# chhatramani/Gemma3-4B-Nepal-Civilcode-v1

## Model Card

This model is a finetuned version of Google's Gemma 3 4B, optimized for **Question Answering (QA)** tasks related to **Nepal Civil Law**. It was trained on a bilingual dataset, enabling it to answer questions in both **Nepali and English**.

## Model Description

`chhatramani/Gemma3-4B-Nepal-Civilcode-v1` is a specialized **Large Language Model (LLM)** designed to provide accurate, relevant answers concerning the **Civil Code of Nepal**. Built on the **Gemma 3 4B architecture** and finetuned on a curated dataset of civil law QA pairs, it aims to bridge the gap in legal AI resources for the Nepali context. Its **bilingual capability** makes it useful to a wide audience, including legal professionals, researchers, students, and the general public.

## Intended Use

This model is primarily intended for:

- **Question Answering**: Answering specific questions about the Nepal Civil Code in both Nepali and English.
- **Legal Information Retrieval**: Quickly locating information within the broad domain of Nepali civil law.
- **Educational Purposes**: Supporting the study and understanding of the provisions of the Nepal Civil Code.
- **AI-Powered Legal Assistants**: Serving as a component in more comprehensive legal AI applications.

## Training Data

The model was finetuned on a bilingual collection of QA pairs derived from **Nepal's Civil Law**, drawn from the following datasets:

### Nepali Civil Law QA Dataset: `chhatramani/nepali_muluki_dewani_QA_v1`

- **Size**: 3.52K unique question-answer pairs.
- **Content**: Questions and answers in **Nepali**, extracted from and related to the **Muluki Dewani Samhita (Civil Code)** of Nepal.

### English Civil Law QA Dataset: `chhatramani/nepal_civil_law_QA_v2`

- **Size**: 2.71K unique question-answer pairs.
- **Content**: Questions and answers in **English**, covering various aspects of **Nepal's Civil Law**.

Together, these datasets are intended to give the model coverage of the subject matter in both languages.

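Both datasets are public on the Hugging Face Hub, so the training data can be inspected directly. Below is a minimal sketch using the `datasets` library; the `train` split and record layout are assumptions, so check each dataset card for the actual schema.

```python
from datasets import load_dataset

# Load both QA datasets from the Hugging Face Hub
nepali_qa = load_dataset("chhatramani/nepali_muluki_dewani_QA_v1")
english_qa = load_dataset("chhatramani/nepal_civil_law_QA_v2")

# Inspect one record from each; the "train" split and column names
# are assumptions (see the dataset cards for the actual schema)
print(nepali_qa["train"][0])
print(english_qa["train"][0])
```
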
## Training Details

- **Base Model**: Google Gemma 3 4B (`google/gemma-3-4b-it`)
- **Finetuning Objective**: Question Answering
- **Average Training Loss**: 0.6
- **Framework**: Hugging Face Transformers
- **Training Hardware**: *(Optional: add details if known, e.g., "NVIDIA A100 GPUs")*
- **Training Software**: *(Optional: add details if known, e.g., "PyTorch, accelerate")*

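The metadata above lists BLEU as the evaluation metric. The exact evaluation setup is not documented here, but as a minimal sketch, generated answers could be scored against reference answers with the `evaluate` library (the example strings below are hypothetical):

```python
import evaluate

# BLEU measures n-gram overlap between generated and reference answers
bleu = evaluate.load("bleu")

predictions = ["Property transfer requires a registered deed."]        # hypothetical model output
references = [["Transferring property requires a registered deed."]]  # hypothetical gold answer

print(bleu.compute(predictions=predictions, references=references))   # dict with 'bleu', 'precisions', ...
```
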
## How to Use

You can load and run this model with the Hugging Face `transformers` library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and model
model_name = "chhatramani/Gemma3-4B-Nepal-Civilcode-v1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Move the model to GPU if one is available
if torch.cuda.is_available():
    model = model.to("cuda")

def generate_answer(question, max_new_tokens=200):
    # For instruction-tuned models, it's good practice to wrap the prompt
    # in an instruction format. Adjust this to match how your data was formatted.
    prompt = f"Question: {question}\nAnswer:"
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids

    if torch.cuda.is_available():
        input_ids = input_ids.to("cuda")

    outputs = model.generate(
        input_ids,
        max_new_tokens=max_new_tokens,
        num_return_sequences=1,
        pad_token_id=tokenizer.eos_token_id,
        do_sample=True,  # sampling gives more varied responses
        top_k=50,
        top_p=0.95,
        temperature=0.7,
    )
    # Decode only the newly generated tokens
    decoded_output = tokenizer.decode(outputs[0][len(input_ids[0]):], skip_special_tokens=True)
    return decoded_output.strip()

# Example usage in Nepali ("What provisions does Article 31 of Nepal's Constitution make?")
nepali_question = "नेपालको संविधानको धारा ३१ मा के व्यवस्था छ?"
nepali_answer = generate_answer(nepali_question)
print(f"Nepali Question: {nepali_question}")
print(f"Nepali Answer: {nepali_answer}\n")

# Example usage in English
english_question = "What are the provisions regarding property rights in the Nepal Civil Code?"
english_answer = generate_answer(english_question)
print(f"English Question: {english_question}")
print(f"English Answer: {english_answer}")
```

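The `Question: ... Answer:` prompt above is a plain format. Since the base model `google/gemma-3-4b-it` is instruction-tuned, the tokenizer's chat template may match the finetuning format more closely; whether this finetune kept Gemma's chat format is an assumption, so verify against how the training data was actually templated. A minimal sketch:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("chhatramani/Gemma3-4B-Nepal-Civilcode-v1")

# Build the prompt via the chat template instead of a hand-rolled format;
# this assumes the finetune preserved Gemma's instruction template
messages = [{"role": "user", "content": "What does the Nepal Civil Code say about contracts?"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # feed this string into the same generate() call as above
```
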
## Limitations and Bias

While `chhatramani/Gemma3-4B-Nepal-Civilcode-v1` is a capable tool, it's important to acknowledge its limitations:

- **Hallucination**: Like all LLMs, this model may occasionally generate incorrect, nonsensical, or hallucinated information. Always verify critical legal information against official sources.
- **Scope**: The model's knowledge is largely limited to the Nepal Civil Code as represented in its training data. It may not be proficient in other legal domains (e.g., criminal law, or constitutional law beyond civil aspects) or in recent amendments absent from the dataset.
- **Nuance and Interpretation**: Legal interpretation often requires human judgment, context, and a deep understanding of specific case facts. This model is not a substitute for professional legal advice.
- **Bias**: The model may inherit biases present in its base model or training data. Authoritative sources were used where possible, but biases in language or legal phrasing may remain.
- **Performance Variation**: Although the model is bilingual, response quality may differ between Nepali and English depending on query complexity and the amount of relevant training data in each language.

## Acknowledgements

- We extend our gratitude to **Google** for developing and open-sourcing the Gemma models.

## Citation

If you use this model in your research or application, please consider citing the original Gemma model and the datasets used for finetuning.

```bibtex
@misc{gemma2024,
  title={Gemma: Open Models from Google},
  author={Google},
  year={2024},
  url={https://blog.google/technology/ai/gemma-open-models-deepmind/}
}
```