|
---
license: apache-2.0
datasets:
  - legacy-datasets/common_voice
language:
  - en
  - sw
metrics:
  - perplexity
base_model:
  - mistralai/Mistral-Small-24B-Instruct-2501
pipeline_tag: text-generation
tags:
  - community
  - surveys
  - engagement
  - referral-tracking
  - community-engagement
---
|
# Stahili LLM

[License: Apache 2.0](LICENSE)
|
|
|
## Overview

Stahili LLM is a large language model designed for community-driven insights, localized interactions, and engagement tracking. Built with a focus on user participation, it facilitates structured data collection, analytics, and automation in survey-based applications.
|
|
|
## Features

- **Conversational AI**: Trained to understand and generate human-like text.
- **Survey and Referral Optimization**: Helps track user participation and referrals.
- **Customizable Workflows**: Supports integration into diverse applications.
- **Multilingual Support**: Handles English and Swahili, enhancing accessibility.
- **Open-Source & Extensible**: Licensed under Apache 2.0, allowing modifications and contributions.
|
|
|
## Installation

Install the required dependencies with `pip`:

```bash
pip install transformers torch
```

Then load the model from the Hugging Face Hub:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "itshunja/stahili"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
```
|
|
|
## Usage

### Generating Text

```python
input_text = "How does Stahili optimize survey engagement?"
inputs = tokenizer(input_text, return_tensors="pt")

# max_new_tokens bounds the length of the generated continuation
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
|
|
|
### Fine-Tuning

To fine-tune Stahili LLM on a specific dataset:

```bash
python train.py --model itshunja/stahili --dataset custom_dataset.json
```
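
The `train.py` script itself is not included in this card. As a rough sketch (not the repository's actual implementation), a minimal causal-LM fine-tuning script using the `transformers` Trainer might look like the following. The flag names mirror the command above; the assumption that the JSON dataset has a `text` field, and the defaults shown, are illustrative:

```python
# Hypothetical sketch of a train.py for causal-LM fine-tuning.
# Assumes a JSON dataset whose records contain a "text" field.
import argparse

def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Fine-tune Stahili LLM")
    parser.add_argument("--model", required=True, help="model id or local path")
    parser.add_argument("--dataset", required=True, help="JSON file with a 'text' field")
    parser.add_argument("--output-dir", default="stahili-finetuned")
    parser.add_argument("--epochs", type=int, default=3)
    return parser.parse_args(argv)

def run(args):
    # Heavy imports live here so argument parsing stays lightweight.
    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    tokenizer = AutoTokenizer.from_pretrained(args.model)
    model = AutoModelForCausalLM.from_pretrained(args.model)

    data = load_dataset("json", data_files=args.dataset)["train"]
    tokenized = data.map(lambda ex: tokenizer(ex["text"], truncation=True),
                         remove_columns=data.column_names)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=args.output_dir,
                               num_train_epochs=args.epochs),
        train_dataset=tokenized,
        # mlm=False selects standard causal (next-token) language modeling
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()
```

Invoking `run(parse_args())` from an entry point would reproduce the command-line behavior shown above.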
|
|
|
## API Integration

For quick integration, use the `transformers` pipeline, which downloads and runs the model locally:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="itshunja/stahili")
response = generator("Explain the Stahili rewards program.")
print(response[0]["generated_text"])
```
|
|
|
## Contributing

We welcome contributions! To contribute:

1. Fork this repository.
2. Create a feature branch (`git checkout -b feature-name`).
3. Commit your changes (`git commit -m 'Add new feature'`).
4. Push to your branch (`git push origin feature-name`).
5. Submit a Pull Request.
|
|
|
## License

This project is licensed under the Apache 2.0 License; see the [LICENSE](LICENSE) file for details.
|
|
|
## Contact

For questions or support, open a thread in the model's [Community tab](https://huggingface.co/itshunja/stahili/discussions) or contact Isaac Hunja.