---
language:
- en
base_model:
- Qwen/Qwen2.5-Math-7B-Instruct
- rombodawg/Rombos-Coder-V2.5-Qwen-7b
- rombodawg/Rombos-LLM-V2.5-Qwen-7b
---

# Utility 19B MoE (3x7B)

This Mixture-of-Experts model combines the following models:

1. [rombodawg/Rombos-LLM-V2.5-Qwen-7b](https://huggingface.co/rombodawg/Rombos-LLM-V2.5-Qwen-7b)
2. [rombodawg/Rombos-Coder-V2.5-Qwen-7b](https://huggingface.co/rombodawg/Rombos-Coder-V2.5-Qwen-7b)
3. [Qwen/Qwen2.5-Math-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Math-7B-Instruct)

It was created with the following `mergekit-moe` config:

```yaml
base_model: rombodawg/Rombos-LLM-V2.5-Qwen-7b
gate_mode: hidden
dtype: bfloat16
experts:
  - source_model: Qwen/Qwen2.5-Math-7B-Instruct
    positive_prompts:
      - "Solve the equation"
      - "Derive the formula"
      - "Given the value x, solve for y"
      - "Find a function that models this"
      - "Find the integral of the function"
      - "Find the first order derivative"
      - "What is the answer to this math question"
  - source_model: rombodawg/Rombos-Coder-V2.5-Qwen-7b
    positive_prompts:
      - "Write a python program"
      - "Write a java program"
      - "Write a C++ program"
      - "Create a quicksort program"
      - "Implement a library that does"
      - "How can I do this in Python"
      - "How can I do this in Java"
      - "How can I do this in C++"
      - "How can I do this in Javascript"
      - "Create a website with HTML"
shared_experts:
  - source_model: rombodawg/Rombos-LLM-V2.5-Qwen-7b
    positive_prompts:
      - "Hello, who are you?"
      - "I need help with"
      - "Can you explain"
      - "Help me with this"
    residual_scale: 0.1  # downweight the shared expert's output to avoid overcooking the model
```
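
Once the merge has been produced and uploaded (or kept locally), it can be loaded like any other `transformers` causal LM. The sketch below is a minimal usage example, not part of the merge itself; the `model_id` path is a placeholder you should replace with wherever this model actually lives.

```python
# Minimal sketch: load the merged MoE and run a single prompt through it.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/utility-19b-moe"  # placeholder: local merge output dir or Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used in the merge config
    device_map="auto",
)

# Qwen2.5-based models ship a chat template with the tokenizer, so use it directly.
messages = [
    {"role": "user", "content": "Find the integral of f(x) = 3x^2 + 2x."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

A math-flavored prompt like the one above should route mostly to the Qwen2.5-Math expert via the hidden-state gating, while coding prompts should favor the Rombos-Coder expert.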