|
--- |
|
license: apache-2.0 |
|
tags: |
|
- merge |
|
- mergekit |
|
- lazymergekit |
|
- hydra-project/ChatHercules-2.5-Mistral-7B |
|
- Nitral-Archive/Prima-Pastacles-7b |
|
language: |
|
- en |
|
base_model: |
|
- hydra-project/ChatHercules-2.5-Mistral-7B |
|
- Nitral-Archive/Prima-Pastacles-7b |
|
library_name: transformers |
|
--- |
|
# Mistral-2.5-Prima-Hercules-Fusion-7B |
|
|
|
**Mistral-2.5-Prima-Hercules-Fusion-7B** is a language model created by merging **hydra-project/ChatHercules-2.5-Mistral-7B** with **Nitral-Archive/Prima-Pastacles-7b** using **spherical linear interpolation (SLERP)**. The merge pairs the conversational depth of Hercules with the contextual adaptability of Prima, making the model well suited to dynamic assistant applications and multi-turn conversations.
|
|
|
## Merged Models
|
|
|
This merge incorporates the following models:

- [**hydra-project/ChatHercules-2.5-Mistral-7B**](https://huggingface.co/hydra-project/ChatHercules-2.5-Mistral-7B): the primary (base) model, contributing strong conversational ability and language comprehension.
- [**Nitral-Archive/Prima-Pastacles-7b**](https://huggingface.co/Nitral-Archive/Prima-Pastacles-7b): contributes contextual adaptability and task-switching, supporting smoother context management across diverse applications.
|
|
|
## Merge Configuration
|
|
|
The configuration below defines how the two models are merged using **spherical linear interpolation (SLERP)**. SLERP interpolates each pair of corresponding weight tensors along an arc on the hypersphere rather than along a straight line, which tends to preserve the weight geometry of both source models better than plain linear averaging.
|
|
|
```yaml
# Mistral-2.5-Prima-Hercules-Fusion-7B Merge Configuration
slices:
  - sources:
      - model: hydra-project/ChatHercules-2.5-Mistral-7B
        layer_range: [0, 32]
      - model: Nitral-Archive/Prima-Pastacles-7b
        layer_range: [0, 32]
merge_method: slerp
base_model: hydra-project/ChatHercules-2.5-Mistral-7B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```
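
To reproduce the merge, save the configuration above (for example as `config.yaml`) and run it through mergekit, either with the CLI (`mergekit-yaml config.yaml ./output-dir`) or from Python. The snippet below is a hedged sketch of the Python route; the class and option names follow mergekit's README and may differ across versions, so verify them against your installed release.

```python
# Hedged sketch: run the merge with mergekit's Python API (pip install mergekit).
# API names (MergeConfiguration, MergeOptions, run_merge) follow mergekit's README;
# double-check them against the version you have installed.
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("config.yaml", "r", encoding="utf-8") as fp:  # the YAML shown above
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    "./Mistral-2.5-Prima-Hercules-Fusion-7B",  # output directory
    options=MergeOptions(cuda=False, copy_tokenizer=True),
)
```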
|
|
|
### Key Parameters |
|
|
|
- **Self-Attention Filter** (`self_attn`): the interpolation schedule applied to the self-attention weights, controlling how much each source model contributes to the attention blocks at different depths.
- **MLP Filter** (`mlp`): the interpolation schedule for the feed-forward (MLP) weights; its values are the complement (`1 - t`) of the self-attention schedule.
- **Global Weight** (`t.value`): the default interpolation factor (`0.5`) for all tensors not matched by a filter, keeping the blend even between the two models. List values act as anchor points that are interpolated across the layer stack, as sketched below.
- **Data Type** (`dtype`): the merge is computed and stored in `bfloat16`, balancing memory efficiency and numerical precision.
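
For intuition, here is a minimal, self-contained sketch of the two mechanisms described above: expanding an anchor list such as `[0, 0.5, 0.3, 0.7, 1]` into a per-layer `t` schedule, and SLERP-blending a pair of weight tensors. It illustrates the math only; the helper names are hypothetical and this is not mergekit's actual implementation.

```python
import numpy as np

def expand_anchors(anchors, num_layers):
    """Stretch a short anchor list into one t value per layer via linear interpolation.
    (Illustrative only; mergekit performs this expansion internally.)"""
    anchor_positions = np.linspace(0.0, 1.0, num=len(anchors))
    layer_positions = np.linspace(0.0, 1.0, num=num_layers)
    return np.interp(layer_positions, anchor_positions, anchors)

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flattened weight tensors."""
    v0_unit = v0 / (np.linalg.norm(v0) + eps)
    v1_unit = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0_unit, v1_unit), -1.0, 1.0)
    omega = np.arccos(dot)            # angle between the two weight directions
    if omega < 1e-6:                  # nearly parallel: fall back to linear interpolation
        return (1.0 - t) * v0 + t * v1
    return (np.sin((1.0 - t) * omega) * v0 + np.sin(t * omega) * v1) / np.sin(omega)

# Per-layer schedule for the self_attn filter over the 32 merged layers
t_per_layer = expand_anchors([0, 0.5, 0.3, 0.7, 1], num_layers=32)

# Toy stand-ins for one attention weight tensor from each source model
w_hercules = np.random.randn(64)
w_prima = np.random.randn(64)
w_merged = slerp(t_per_layer[10], w_hercules, w_prima)
print(round(float(t_per_layer[10]), 3), w_merged.shape)
```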
|
|
|
## Performance Highlights
|
|
|
- **Enhanced Multi-Turn Conversation Handling**: Improved context retention facilitates more coherent and contextually aware multi-turn interactions. |
|
- **Dynamic Assistant Applications**: Excels in role-play and scenario-based interactions, providing nuanced and adaptable responses. |
|
- **Balanced Integration**: Combines the conversational depth of Hercules with the contextual adaptability of Prima for versatile performance across various tasks. |
|
|
|
## Use Cases & Applications
|
|
|
**Mistral-2.5-Prima-Hercules-Fusion-7B** is designed to excel in environments that demand both conversational prowess and specialized task execution. Ideal applications include: |
|
|
|
- **Advanced Conversational Agents**: Powering chatbots and virtual assistants with nuanced understanding and responsive capabilities. |
|
- **Educational Tools**: Assisting in tutoring systems, providing explanations, and facilitating interactive learning experiences. |
|
- **Content Generation**: Creating high-quality, contextually relevant content for blogs, articles, and marketing materials. |
|
- **Technical Support**: Offering precise and efficient support in specialized domains such as IT, healthcare, and finance. |
|
- **Role-Playing Scenarios**: Enhancing interactive storytelling and simulation-based training with dynamic and contextually aware responses. |
|
|
|
## Usage
|
|
|
To utilize **Mistral-2.5-Prima-Hercules-Fusion-7B**, follow the steps below: |
|
|
|
### Installation |
|
|
|
First, install the necessary libraries: |
|
|
|
```bash
pip install -qU transformers accelerate
```
|
|
|
### Inference |
|
|
|
Below is an example of how to load and use the model for text generation: |
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch

# Define the model name
model_name = "ZeroXClem/Mistral-2.5-Prima-Hercules-Fusion-7B"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the model in bfloat16 and spread it across available devices
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Initialize the pipeline (the model already carries its dtype and device placement)
text_generator = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

# Define the input prompt
prompt = "Explain the significance of artificial intelligence in modern healthcare."

# Generate the output
outputs = text_generator(
    prompt,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
)

# Print the generated text
print(outputs[0]["generated_text"])
```
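
For the multi-turn conversations this model targets, it is usually better to format the dialogue with the tokenizer's chat template (when one is provided) rather than passing a raw string. A minimal sketch, continuing from the example above and assuming the merged tokenizer ships a Mistral-style chat template:

```python
# Format a conversation with the tokenizer's chat template, if one is defined.
# Reuses `tokenizer` and `text_generator` from the example above.
messages = [
    {"role": "user", "content": "Explain the significance of artificial intelligence in modern healthcare."},
]

if tokenizer.chat_template is not None:
    chat_prompt = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,  # append the assistant turn marker before generating
    )
else:
    chat_prompt = messages[0]["content"]  # fall back to the raw prompt

outputs = text_generator(chat_prompt, max_new_tokens=150, do_sample=True, temperature=0.7)
print(outputs[0]["generated_text"])
```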
|
|
|
### Notes |
|
|
|
- **Fine-Tuning**: For best results in a specific application, fine-tune the merged model on task-specific data; a minimal LoRA sketch is given below.
- **Resource Requirements**: Ensure your environment has sufficient memory and compute; the 7B parameters occupy roughly 14 GB in `bfloat16` for the weights alone, so GPU-enabled hardware is recommended for responsive inference.
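
As a starting point for that fine-tuning, the sketch below attaches LoRA adapters with the `peft` library (`pip install -qU peft`) so that only a small fraction of the parameters are trained. The target modules follow the usual Mistral attention projection names, and the hyperparameters are illustrative rather than tuned.

```python
# Minimal LoRA fine-tuning setup sketch using peft; hyperparameters are illustrative.
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,                         # adapter rank
    lora_alpha=32,                # adapter scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # Mistral attention projections
    task_type="CAUSAL_LM",
)

# `model` is the AutoModelForCausalLM loaded in the inference example above.
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()  # only the LoRA adapters are trainable

# Train `peft_model` on task-specific data with your preferred trainer,
# then persist the adapters with peft_model.save_pretrained("./lora-adapters").
```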
|
|
|
|
|
## License
|
|
|
This model is open-sourced under the **Apache-2.0 License**. |
|
|
|
## Tags
|
|
|
- `merge` |
|
- `mergekit` |
|
- `slerp` |
|
- `Mistral` |
|
- `hydra-project/ChatHercules-2.5-Mistral-7B` |
|
- `Nitral-Archive/Prima-Pastacles-7b` |
|
|
|
--- |