MT-SLM-7B
MT-SLM-7B is a mixture-of-experts-style merge of four specialized models, built to be a well-rounded model for diverse tasks. It targets coding, mathematical problem-solving, storytelling, and general-purpose chat. The merge was performed with LazyMergekit.
🧩 Component Models
MT-SLM-7B integrates four expert models:
- Mathematics Expert: Finetuned for mathematical reasoning and problem-solving.
- Coding Expert: Finetuned for generating high-quality Python and general programming code.
- Chat Expert: A general-purpose conversational AI for everyday interactions.
- Storytelling Expert: Finetuned for generating creative and engaging stories.
The individual models contributing to this mixture are:
- Chat Model: mlabonne/AlphaMonarch-7B, a general-purpose model for most interactions.
- Code Model: beowolx/CodeNinja-1.0-OpenChat-7B, a highly capable coding model.
- Math Model: mlabonne/NeuralDaredevil-7B, specialized in mathematical reasoning, with strong MMLU and GSM8K scores.
- Role-Play & Storytelling Model: SanjiWatsuki/Kunoichi-DPO-v2-7B, known for high-quality storytelling and role-playing (MT-Bench score of 8.51).
This model supports an 8k context window for extended interactions.
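As a rough illustration of budgeting against the 8k window, the sketch below checks whether a prompt plus the requested generation length fits. It assumes "8k" means 8192 tokens, which this card does not state explicitly. `CONTEXT_WINDOW`, `fits_context`, and `max_generation_budget` are hypothetical names, not part of any library.

```python
# Assumption: "8k" is taken to mean 8192 tokens; the card does not give the exact figure.
CONTEXT_WINDOW = 8192


def fits_context(prompt_tokens: int, max_new_tokens: int, window: int = CONTEXT_WINDOW) -> bool:
    """Return True if the prompt plus the generation budget fits in the window."""
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_new_tokens <= window


def max_generation_budget(prompt_tokens: int, window: int = CONTEXT_WINDOW) -> int:
    """Return how many new tokens can still be generated after the prompt."""
    return max(window - prompt_tokens, 0)
```

In practice, `prompt_tokens` could be obtained by tokenizing the prompt, for example with `len(tokenizer(prompt)["input_ids"])`.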
🛠️ Model Configuration
slices:
  - sources:
      - model: jaiyeshchahar/ChatingDeveloper-7B-slerp
        layer_range: [0, 32]
      - model: jaiyeshchahar/storywriter-mathematician
        layer_range: [0, 32]
merge_method: slerp
base_model: jaiyeshchahar/storywriter-mathematician
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
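To give a feel for what the `slerp` merge method and the `t` lists in this configuration mean, here is a minimal pure-Python sketch. It is not mergekit's implementation. It assumes mergekit's documented convention that `t = 0` yields the `base_model` weights and `t = 1` yields the other model's, and it assumes that a list of `t` values is spread evenly across layer depth and linearly interpolated in between. The helper names `layer_t` and `slerp` are illustrative only.

```python
import math
from typing import List, Sequence


def layer_t(anchors: Sequence[float], layer_idx: int, num_layers: int) -> float:
    """Interpolate a list of anchor values across layer depth.

    Assumption: anchors are spaced evenly from the first layer to the last.
    """
    if num_layers < 2 or len(anchors) == 1:
        return float(anchors[0])
    pos = layer_idx / (num_layers - 1) * (len(anchors) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(anchors) - 1)
    frac = pos - lo
    return (1 - frac) * anchors[lo] + frac * anchors[hi]


def slerp(t: float, v0: Sequence[float], v1: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Spherical linear interpolation between two equal-length vectors."""
    linear = [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        return linear  # direction undefined: fall back to linear interpolation
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if sin_theta < eps:
        return linear  # nearly parallel vectors: fall back to linear interpolation
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# t schedule for self-attention tensors, copied from the configuration above
SELF_ATTN_T = [0, 0.5, 0.3, 0.7, 1]
t_first = layer_t(SELF_ATTN_T, 0, 32)   # first layer
t_last = layer_t(SELF_ATTN_T, 31, 32)   # last layer
```

Under these assumptions, the self-attention weights start at the base model in the first layer and end at the other model in the last layer, while the `mlp` schedule runs in the opposite direction. The final `value: 0.5` would apply to every tensor not matched by a filter.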
---
license: apache-2.0
base_model:
- jaiyeshchahar/ChatingDeveloper-7B-slerp
- jaiyeshchahar/storywriter-mathematician
tags:
- merge
- mergekit
- lazymergekit
- jaiyeshchahar/ChatingDeveloper-7B-slerp
- jaiyeshchahar/storywriter-mathematician
---
Usage
1. Install Dependencies
Install the required libraries using pip:
pip install -qU transformers accelerate
2. Load the Model and Generate Text
Below is an example Python script to load the model and generate text:
from transformers import AutoTokenizer
import transformers
import torch
# Specify the model name
model = "jaiyeshchahar/MT-SLM-7B"
# Define your conversation as a list of messages
messages = [{"role": "user", "content": "What is a large language model?"}]
# Initialize the tokenizer and prepare the prompt
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Set up the text generation pipeline
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Generate text output
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
3. Example Use Cases
- Article Explanation: Summarize and explain complex articles.
- Coding Assistance: Generate, debug, and explain Python code.
- Mathematical Problem Solving: Handle computations and logical reasoning.
- Creative Storytelling: Craft engaging narratives and role-play scenarios.
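One possible way to route these use cases through the same chat interface is sketched below. The prompt prefixes, the `TASK_PREFIXES` table, and the `build_messages` helper are all hypothetical and chosen only for illustration; the model does not require any particular wording.

```python
from typing import Dict, List

# Hypothetical prompt prefixes, one per use case listed above.
TASK_PREFIXES: Dict[str, str] = {
    "explain": "Summarize and explain the following article:\n\n",
    "code": "Write Python code for the following task and explain it:\n\n",
    "math": "Solve the following problem step by step:\n\n",
    "story": "Write a short story based on this premise:\n\n",
}


def build_messages(task: str, text: str) -> List[Dict[str, str]]:
    """Build a single-turn chat message list for one of the use cases."""
    if task not in TASK_PREFIXES:
        raise ValueError(f"unknown task: {task!r}")
    return [{"role": "user", "content": TASK_PREFIXES[task] + text}]


messages = build_messages("math", "What is the sum of the first 100 positive integers?")
```

The resulting `messages` list has the same shape as the one in step 2, so it can be passed to `tokenizer.apply_chat_template` in the same way.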
🎯 Conclusion
MT-SLM-7B is a well-rounded assistant that draws on four expert models to cover coding, mathematics, storytelling, and general chat. Whether you need a coding companion, a math tutor, or a creative storyteller, this model is designed for the job. Try it out and explore the full range of its capabilities!
Happy generating!