---
language:
- en
base_model:
- Qwen/Qwen2.5-Math-7B-Instruct
- rombodawg/Rombos-Coder-V2.5-Qwen-7b
- rombodawg/Rombos-LLM-V2.5-Qwen-7b
---
# Utility 19B MoE (3x7B)
This Mixture-of-Experts model combines the following models:
1. [rombodawg/Rombos-LLM-V2.5-Qwen-7b](https://huggingface.co/rombodawg/Rombos-LLM-V2.5-Qwen-7b)
2. [rombodawg/Rombos-Coder-V2.5-Qwen-7b](https://huggingface.co/rombodawg/Rombos-Coder-V2.5-Qwen-7b)
3. [Qwen/Qwen2.5-Math-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Math-7B-Instruct)
It was created with the following `mergekit-moe` config:
```yaml
base_model: rombodawg/Rombos-LLM-V2.5-Qwen-7b
gate_mode: hidden
dtype: bfloat16
experts:
  - source_model: Qwen/Qwen2.5-Math-7B-Instruct
    positive_prompts:
      - "Solve the equation"
      - "Derive the formula"
      - "Given the value x, solve for y"
      - "Find a function that models this"
      - "Find the integral of the function"
      - "Find the first order derivative"
      - "What is the answer to this math question"
  - source_model: rombodawg/Rombos-Coder-V2.5-Qwen-7b
    positive_prompts:
      - "Write a python program"
      - "Write a java program"
      - "Write a C++ program"
      - "Create a quicksort program"
      - "Implement a library that does"
      - "How can I do this in Python"
      - "How can I do this in Java"
      - "How can I do this in C++"
      - "How can I do this in Javascript"
      - "Create a website with HTML"
shared_experts:
  - source_model: rombodawg/Rombos-LLM-V2.5-Qwen-7b
    positive_prompts:
      - "Hello, who are you?"
      - "I need help with"
      - "Can you explain"
      - "Help me with this"
    residual_scale: 0.1 # downweight output from shared expert to prevent overcooking the model
```
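
A config like this is typically passed to the `mergekit-moe` CLI (e.g. `mergekit-moe config.yaml ./output-model-dir`) to produce the merged checkpoint. Below is a minimal inference sketch using 🤗 Transformers; the model path is a placeholder, not a real repo id, and generation settings are illustrative only.

```python
# Minimal inference sketch (assumes the merged model has been saved or uploaded;
# replace "path/to/Utility-19B-MoE" with the actual local path or HF repo id).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/Utility-19B-MoE"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used in the merge config
    device_map="auto",
)

# Qwen2.5-based models ship a chat template with the tokenizer.
messages = [{"role": "user", "content": "Find the integral of x**2 * sin(x)."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```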