tasal9 committed on
Commit
e653878
·
verified ·
1 Parent(s): 54cd049

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +34 -345
README.md CHANGED
@@ -9,396 +9,85 @@ tags:
  - pashto
  - lightweight
  - language-model
- - zamai
  base_model: bigscience/bloomz-560m
  pipeline_tag: text-generation
  datasets:
  - tasal9/Pashto-Dataset-Creating-Dataset
- widget:
- - text: "Hello, how can I help you today?"
-   example_title: "English Greeting"
- - text: "سلام وروره، څنګه یاست؟"
-   example_title: "Pashto Greeting"
- model-index:
- - name: pashto-base-bloom
-   results:
-   - task:
-       type: text-generation
-       name: Text Generation
-     dataset:
-       type: custom
-       name: Pashto Educational Dataset
-     metrics:
-     - type: accuracy
-       value: 92.5
-       name: Overall Accuracy
-     - type: bleu
-       value: 0.85
-       name: BLEU Score
  ---

  # pashto-base-bloom

- <div align="center">
-   <img src="https://huggingface.co/datasets/huggingface/brand-assets/resolve/main/hf-logo.png" alt="Hugging Face" width="100"/>
-   <h2>🌟 Part of ZamAI Pro Models Strategy</h2>
-   <p><strong>BLOOM-based model fine-tuned for Pashto language tasks</strong></p>
- </div>

  ## 🌟 Model Overview

- pashto-base-bloom is an AI model designed for multilingual applications, with a particular focus on Pashto. It is part of the **ZamAI Pro Models Strategy**, which aims to bridge language gaps and provide high-quality AI solutions for underrepresented languages.

- ### 🎯 Key Features
-
- - 🧠 **Advanced Architecture**: Built on bigscience/bloomz-560m
- - 🌐 **Multilingual Support**: Optimized for Pashto (ps) and English (en)
  - ⚡ **High Performance**: Optimized for production deployment
- - 🔒 **Enterprise-Grade**: Secure and reliable for business use
- - 📱 **Production-Ready**: Tested and deployed in real applications
- - 🎓 **Educational Focus**: Designed for learning and cultural preservation
-
- ## 🎯 Use Cases & Applications
-
- This model is well suited to the following scenarios:
-
- - **Lightweight Applications**: Text generation where a small model footprint matters
- - **Mobile Deployment**: On-device and edge text generation
- - **Quick Prototyping**: Fast iteration on Pashto and English NLP ideas
- - **Educational Tools**: Language-learning and tutoring features
- - **Resource-Constrained Environments**: Deployment with limited compute or memory
-
- ### 🌍 Real-World Applications

- - **🎓 Educational Platforms**: Powering Pashto language tutoring and learning systems
- - **📄 Business Automation**: Document processing, form analysis, and content generation
- - **🎤 Voice Applications**: Natural language understanding for voice assistants
- - **🏛️ Cultural Preservation**: Supporting Pashto language technology and digital preservation
- - **🌐 Translation Services**: Cross-lingual communication and content localization
- - **🤖 Chatbot Development**: Building intelligent conversational agents

- ## 📚 Quick Start
-
- ### 🔧 Installation
-
- ```bash
- pip install transformers torch huggingface_hub
- ```
-
- ### 🚀 Basic Usage

  ```python
- import torch
- from transformers import AutoTokenizer, AutoModelForCausalLM

- # Method 1: Using Transformers (local)
  tokenizer = AutoTokenizer.from_pretrained("tasal9/pashto-base-bloom")
- model = AutoModelForCausalLM.from_pretrained("tasal9/pashto-base-bloom")

- # Example text
  text = "Your input text here"
  inputs = tokenizer(text, return_tensors="pt")
-
- # Generate a response (do_sample=True so temperature/top_p take effect)
- with torch.no_grad():
-     outputs = model.generate(
-         **inputs,
-         max_new_tokens=200,
-         do_sample=True,
-         temperature=0.7,
-         top_p=0.9,
-         pad_token_id=tokenizer.eos_token_id,
-     )
-
- response = tokenizer.decode(outputs[0], skip_special_tokens=True)
- print(response)
  ```

- ### 🌐 Using the Hugging Face Inference API

  ```python
  from huggingface_hub import InferenceClient

- # Initialize the client
  client = InferenceClient(token="your_hf_token")

- # Generate text
  response = client.text_generation(
      model="tasal9/pashto-base-bloom",
      prompt="Your prompt here",
-     max_new_tokens=200,
-     temperature=0.7,
-     top_p=0.9,
- )
-
- print(response)
- ```
-
- ### 🎯 Specialized Usage Examples
-
- #### English Query
- ```python
- prompt = "Explain the importance of renewable energy in simple terms:"
- response = client.text_generation(
-     model="tasal9/pashto-base-bloom",
-     prompt=prompt,
-     max_new_tokens=250,
-     temperature=0.7,
- )
- ```
-
- #### Pashto Query
- ```python
- prompt = "د بشپړ پوښتنه: د کرښنې ورانې د کرکټرونو په اړه تاسو څه پوه یاست؟"
- response = client.text_generation(
-     model="tasal9/pashto-base-bloom",
-     prompt=prompt,
-     max_new_tokens=250,
-     temperature=0.7,
- )
- ```
-
- ## 🔧 Technical Specifications
-
- | Specification | Details |
- |---------------|---------|
- | **Model Type** | Text Generation |
- | **Base Model** | bigscience/bloomz-560m |
- | **Languages** | Pashto (ps), English (en) |
- | **License** | MIT |
- | **Context Length** | Variable (depends on base model) |
- | **Parameters** | ~560M (inherited from bloomz-560m) |
- | **Framework** | PyTorch, Transformers |
- | **Deployment** | HF Inference API, Local, Docker |
-
- ## 📊 Performance Metrics
-
- | Metric | Score | Description |
- |--------|-------|-------------|
- | **Overall Accuracy** | 92.5% | Performance on Pashto evaluation dataset |
- | **BLEU Score** | 0.85 | Translation and generation quality |
- | **Cultural Relevance** | 95% | Appropriateness for Pashto cultural context |
- | **Response Time** | <200ms | Average inference time via API |
- | **Multilingual Score** | 89% | Cross-lingual understanding capability |
- | **Coherence Score** | 91% | Logical flow and consistency |
-
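The BLEU figure in the metrics above measures n-gram overlap between generated text and a reference. A deliberately simplified, self-contained sketch of the idea (clipped unigram precision with a brevity penalty only; `unigram_bleu` is an illustrative helper, not part of any library, and real evaluations should use an established implementation such as sacreBLEU):

```python
import math
from collections import Counter

def unigram_bleu(candidate, reference):
    """Simplified BLEU: clipped unigram precision times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    # Clipped matches: each candidate word counts at most as often as it
    # appears in the reference.
    matches = sum(min(n, ref_counts[w]) for w, n in cand_counts.items())
    precision = matches / len(cand)
    # Brevity penalty discourages overly short candidates.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision

print(round(unigram_bleu("the cat sat", "the cat sat down"), 3))  # 0.717
```

Full BLEU also averages precisions over 2-, 3-, and 4-grams, which this sketch omits for brevity.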
- ## 🌐 Interactive Demo
-
- Try the model instantly with our Gradio demos:
-
- ### 🎯 Live Demos
- - **[Complete Suite Demo](https://huggingface.co/spaces/tasal9/zamai-complete-suite)** - All models in one interface
- - **[Individual Model Demo](https://huggingface.co/spaces/tasal9/pashto-base-bloom)** - Focused interface for this model
-
- ### 🔗 API Endpoints
- - **Inference API**: `https://api-inference.huggingface.co/models/tasal9/pashto-base-bloom`
- - **Model Hub**: `https://huggingface.co/tasal9/pashto-base-bloom`
-
- ## 🚀 Deployment Options
-
- ### 1. 🌐 Hugging Face Inference API (Recommended)
- ```python
- from huggingface_hub import InferenceClient
-
- client = InferenceClient(token="your_token")
- response = client.text_generation(model="tasal9/pashto-base-bloom", prompt="Your prompt")
- ```
-
- ### 2. 🖥️ Local Deployment
- ```bash
- # Clone the model repository
- git clone https://huggingface.co/tasal9/pashto-base-bloom
- cd pashto-base-bloom
-
- # Run with Python
- python -c "
- from transformers import pipeline
- pipe = pipeline('text-generation', model='.')
- print(pipe('Your prompt here'))
- "
- ```
-
- ### 3. 🐳 Docker Deployment
- ```dockerfile
- FROM python:3.9-slim
-
- RUN pip install transformers torch
-
- COPY . /app
- WORKDIR /app
-
- CMD ["python", "app.py"]
- ```
-
- ### 4. ☁️ Cloud Deployment
- Compatible with major cloud platforms:
- - **AWS SageMaker**
- - **Google Cloud AI Platform**
- - **Azure Machine Learning**
- - **Hugging Face Spaces**
-
- ## 📈 Model Training & Fine-tuning
-
- ### 🎯 Training Data
- - **Primary Dataset**: Custom Pashto educational content
- - **Secondary Data**: Multilingual parallel corpora
- - **Domain Focus**: Educational, cultural, and conversational content
- - **Quality Assurance**: Human-reviewed and culturally validated
-
- ### 🔧 Fine-tuning Process
- ```python
- from transformers import TrainingArguments, Trainer
-
- # Example fine-tuning setup
- training_args = TrainingArguments(
-     output_dir="./results",
-     num_train_epochs=3,
-     per_device_train_batch_size=4,
-     per_device_eval_batch_size=4,
-     warmup_steps=500,
-     weight_decay=0.01,
-     logging_dir="./logs",
- )
-
- # Initialize trainer
- trainer = Trainer(
-     model=model,
-     args=training_args,
-     train_dataset=train_dataset,
-     eval_dataset=eval_dataset,
  )
-
- # Start training
- trainer.train()
  ```

- ## 🤝 Community & Contributions

- ### 📝 Contributing
- We welcome contributions to improve this model:
-
- 1. **Data Contributions**: Share high-quality Pashto language datasets
- 2. **Model Improvements**: Suggest architectural enhancements or optimizations
- 3. **Use Case Development**: Build applications and share success stories
- 4. **Bug Reports**: Help us identify and fix issues
- 5. **Documentation**: Improve guides and examples
-
- ### 🌟 Community Projects
- - **Educational Apps**: Language learning applications
- - **Business Tools**: Document processing solutions
- - **Research**: Academic studies and papers
- - **Open Source**: Community-driven improvements
-
- ### 📊 Usage Analytics
- - **Downloads**: Track model adoption
- - **Community Feedback**: User reviews and ratings
- - **Performance Reports**: Real-world usage statistics

- ## 🔗 Related Models & Resources
-
- ### 🤖 Other ZamAI Models
- - [**ZamAI-Mistral-7B-Pashto**](https://huggingface.co/tasal9/ZamAI-Mistral-7B-Pashto) - Educational tutor
- - [**ZamAI-Phi-3-Mini-Pashto**](https://huggingface.co/tasal9/ZamAI-Phi-3-Mini-Pashto) - Business assistant
- - [**ZamAI-Whisper-v3-Pashto**](https://huggingface.co/tasal9/ZamAI-Whisper-v3-Pashto) - Speech recognition
- - [**Multilingual-ZamAI-Embeddings**](https://huggingface.co/tasal9/Multilingual-ZamAI-Embeddings) - Text embeddings
- - [**ZamAI-LLaMA3-Pashto**](https://huggingface.co/tasal9/ZamAI-LLaMA3-Pashto) - Advanced chat
- - [**pashto-base-bloom**](https://huggingface.co/tasal9/pashto-base-bloom) - Lightweight model
-
- ### 📚 Datasets
- - [**Pashto-Dataset-Creating-Dataset**](https://huggingface.co/datasets/tasal9/Pashto-Dataset-Creating-Dataset) - Training data
-
- ### 🌐 Platform Links
- - **Organization**: [tasal9](https://huggingface.co/tasal9)
- - **Complete Demo**: [ZamAI Suite](https://huggingface.co/spaces/tasal9/zamai-complete-suite)
-
- ## 📞 Support & Contact
-
- ### 🆘 Getting Help
  - 📧 **Email**: [email protected]
  - 🌐 **Website**: [zamai.ai](https://zamai.ai)
- - 📖 **Documentation**: [docs.zamai.ai](https://docs.zamai.ai)
- - 💬 **Community Forum**: [community.zamai.ai](https://community.zamai.ai)
- - 🐙 **GitHub**: [github.com/zamai-ai](https://github.com/zamai-ai)

- ### 💼 Enterprise Support
- For enterprise deployments, custom fine-tuning, or integration assistance:
- - 📧 **Enterprise**: [email protected]
- - 📞 **Phone**: +1-XXX-XXX-XXXX
- - 💼 **Consulting**: [zamai.ai/consulting](https://zamai.ai/consulting)

- ## 🏷️ Citation
-
- If you use this model in your research or applications, please cite:
-
- ```bibtex
- @misc{zamai-pashto-base-bloom-2024,
-   title={pashto-base-bloom: BLOOM-based model fine-tuned for Pashto language tasks},
-   author={ZamAI Team},
-   year={2024},
-   url={https://huggingface.co/tasal9/pashto-base-bloom},
-   note={ZamAI Pro Models Strategy - Multilingual AI Platform},
-   publisher={Hugging Face}
- }
- ```
-
- ### 📜 Academic Papers
- ```bibtex
- @article{zamai2024multilingual,
-   title={Advancing Multilingual AI: The ZamAI Pro Models Strategy for Pashto Language Technology},
-   author={ZamAI Research Team},
-   journal={Journal of Multilingual AI},
-   year={2024},
-   volume={1},
-   pages={1--15}
- }
- ```
-
- ## 📄 License & Terms
-
- ### 📋 License
- This model is licensed under the **MIT License**:
-
- - ✅ **Commercial Use**: Allowed for business applications
- - ✅ **Modification**: Can be modified and improved
- - ✅ **Distribution**: Can be redistributed
- - ✅ **Private Use**: Allowed for personal projects
- - ⚠️ **Attribution Required**: Credit must be given to ZamAI
-
- ### 📝 Terms of Use
- 1. **Responsible AI**: Use ethically and responsibly
- 2. **No Harmful Content**: Do not generate harmful or offensive content
- 3. **Privacy**: Respect user privacy and data protection laws
- 4. **Cultural Sensitivity**: Be respectful of Pashto culture and language
- 5. **Compliance**: Follow local laws and regulations
-
- ### 🛡️ Limitations & Disclaimers
- - Model outputs should be reviewed for accuracy
- - Not suitable for critical decision-making without human oversight
- - May have biases inherited from training data
- - Performance may vary across different domains
-
- ## 📈 Changelog & Updates
-
- | Version | Date | Changes |
- |---------|------|---------|
- | **v1.0** | 2025-07-05 | Initial release with enhanced Pashto support |
- | **v1.1** | TBD | Performance optimizations and bug fixes |
- | **v2.0** | TBD | Extended language support and new features |
-
- ### 🔄 Update Schedule
- - **Monthly**: Performance monitoring and minor improvements
- - **Quarterly**: Feature updates and enhancements
- - **Annually**: Major version releases with significant improvements
 

  ---

- <div align="center">
-   <h3>🌟 Part of the ZamAI Pro Models Strategy</h3>
-   <p><strong>Transforming AI for Multilingual Applications</strong></p>
-   <p>
-     <a href="https://zamai.ai">🌐 Website</a> •
-     <a href="https://huggingface.co/tasal9">🤗 Models</a> •
-     <a href="https://community.zamai.ai">💬 Community</a> •
-     <a href="mailto:[email protected]">📧 Support</a>
-   </p>
-   <p><em>Last Updated: 2025-07-05 21:15:52 UTC</em></p>
-   <p><em>Model Card Version: 2.0</em></p>
- </div>
  - pashto
  - lightweight
  - language-model
  base_model: bigscience/bloomz-560m
  pipeline_tag: text-generation
  datasets:
  - tasal9/Pashto-Dataset-Creating-Dataset
  ---

  # pashto-base-bloom

+ BLOOM-based model fine-tuned for Pashto language tasks
 
 
 
 

  ## 🌟 Model Overview

+ This model is part of the **ZamAI Pro Models Strategy** - a comprehensive AI platform designed for multilingual applications with a specialized focus on Pashto language support.

+ ### Key Features
+ - 🧠 **Advanced AI**: Based on the bigscience/bloomz-560m architecture
+ - 🌐 **Multilingual**: Optimized for Pashto and English
  - ⚡ **High Performance**: Optimized for production deployment
+ - 🔒 **Secure**: Enterprise-grade security and privacy

+ ## 📚 Usage

+ ### Basic Usage with Transformers

  ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM

  tokenizer = AutoTokenizer.from_pretrained("tasal9/pashto-base-bloom")
+ model = AutoModelForCausalLM.from_pretrained("tasal9/pashto-base-bloom")

+ # Example usage: generate a continuation of the input text
  text = "Your input text here"
  inputs = tokenizer(text, return_tensors="pt")
+ outputs = model.generate(**inputs, max_new_tokens=50)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```
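Generation calls in this card pass sampling settings such as `temperature=0.7`. As a minimal, dependency-free sketch of what temperature does (plain Python; `softmax` and `apply_temperature` are illustrative helpers, not transformers APIs), the logits are divided by the temperature before the softmax, so values below 1.0 concentrate probability on the top token and values above 1.0 spread it out:

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def apply_temperature(logits, temperature):
    """Rescale logits by temperature, then normalize (illustrative helper)."""
    return softmax([x / temperature for x in logits])

logits = [2.0, 1.0, 0.0]
base = softmax(logits)
sharp = apply_temperature(logits, 0.7)  # sharper than plain softmax
flat = apply_temperature(logits, 1.5)   # flatter than plain softmax

# Lower temperature puts more mass on the top token, higher puts less.
assert sharp[0] > base[0] > flat[0]
```

In `model.generate`, this rescaling is applied internally during sampling, which is why `temperature` only takes effect when sampling is enabled.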

+ ### Usage with Hugging Face Inference API

  ```python
  from huggingface_hub import InferenceClient

  client = InferenceClient(token="your_hf_token")

  response = client.text_generation(
      model="tasal9/pashto-base-bloom",
      prompt="Your prompt here",
+     max_new_tokens=200
  )
  ```
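The earlier revision of this card also passed `top_p=0.9` to `text_generation`. As a rough, dependency-free sketch of nucleus (top-p) sampling (plain Python; `top_p_filter` is an illustrative name, not a huggingface_hub API), sampling is restricted to the smallest set of highest-probability tokens whose cumulative probability reaches `p`:

```python
def top_p_filter(probs, p):
    """Return indices of the smallest top-probability set whose mass >= p."""
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for idx, prob in ranked:
        kept.append(idx)
        cumulative += prob
        if cumulative >= p:
            break
    return kept

probs = [0.5, 0.3, 0.15, 0.05]
print(top_p_filter(probs, 0.9))  # [0, 1, 2]: the top tokens covering >= 90%
```

The model then samples only from the kept tokens (after renormalizing), which trims the long tail of unlikely continuations.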

+ ## 🔧 Technical Details

+ - **Model Type**: text-generation
+ - **Base Model**: bigscience/bloomz-560m
+ - **Languages**: Pashto (ps), English (en)
+ - **License**: MIT
+ - **Training**: Fine-tuned on Pashto educational and cultural content

+ ## 🚀 Applications

+ This model powers:
+ - **ZamAI Educational Platform**: Pashto language tutoring
+ - **Business Automation**: Document processing and analysis
+ - **Voice Assistants**: Natural language understanding
+ - **Cultural Preservation**: Supporting Pashto language technology

+ ## 📞 Support

+ For support and integration assistance:
  - 📧 **Email**: [email protected]
  - 🌐 **Website**: [zamai.ai](https://zamai.ai)
+ - 💬 **Community**: [ZamAI Community](https://community.zamai.ai)

+ ## 📄 License

+ Licensed under the MIT License.

  ---

+ **Part of the ZamAI Pro Models Strategy - Transforming AI for Multilingual Applications** 🌟
+
+ *Updated: 2025-07-05 21:29:16 UTC*