Alfitaria/Q25-1.5B-VeoLu

@@ -1,78 +1,92 @@
----
-base_model:
-- Qwen/Qwen2.5-1.5B-Instruct
-base_model_relation: finetune
-library_name: peft
-pipeline_tag: text-generation
-tags:
-- mergekit
-- merge
-- llama-factory
-- lora
-datasets:
-- allura-org/fujin-cleaned-stage-1
-- Dampfinchen/Creative_Writing_Multiturn
-- ToastyPigeon/SpringDragon
-- allura-org/medquad_sharegpt
-- allura-org/scienceqa_sharegpt
-- Alignment-Lab-AI/orcamath-sharegpt
----
-# Q25-1.5-VeoLu-R2
-![made with StableNoobAI-IterSPO in sd-webui-forge](veolu.png)
-[*A source of life and hope for the land.*](https://www.youtube.com/watch?v=TJRq1Ag2Wmw)
-Q25-1.5B-Veo Lu is a tiny General-Purpose Creative model, made up of a merge of bespoke finetunes on Qwen 2.5-1.5B-Instruct.
-Inspired by the success of [MN-12B-Mag Mell](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1) and [MS-Meadowlark-22B](https://huggingface.co/allura-org/MS-Meadowlark-22B), Veo Lu was trained on a healthy, balanced diet of of Internet fiction, roleplaying, adventuring, and reasoning/general knowledge.
-The components of Veo Lu are:
-* Bard (pretrain, writing): [Fujin (Cleaned/extended Rosier)](https://huggingface.co/datasets/allura-org/fujin-cleaned-stage-1)
-* Scribe (pretrain, roleplay): [Creative Writing Multiturn](https://huggingface.co/datasets/Dampfinchen/Creative_Writing_Multiturn)
-* Cartographer (pretrain, adventuring): [SpringDragon](https://huggingface.co/datasets/ToastyPigeon/SpringDragon)
-* Alchemist (SFT, science/reasoning): [ScienceQA,](https://huggingface.co/datasets/allura-org/scienceqa_sharegpt) [MedquadQA,](https://huggingface.co/datasets/allura-org/medquad_sharegpt) [Orca Math Word Problems](https://huggingface.co/datasets/Alignment-Lab-AI/orcamath-sharegpt)
-This model is capable of carrying on a scene without going completely off the rails. That being said, it only has 1.5B parameters. So please, for the love of God, *manage your expectations.*
-Since it's Qwen, use ChatML formatting. Turn the temperature down to ~0.7-0.8 and try a dash of rep-pen.
-GGUFs coming soon, but honestly, the full-precision model is 3.5GB in size. You might wanna have a go at running this unquantized with vLLM.
-```
-pip install vllm
-vllm serve Alfitaria/Q25-1.5B-VeoLu --max-model-len 16384 --max-num-seqs 1
-```
-Made by inflatebot.
-Special thanks to our friends at [Allura](https://huggingface.co/allura-org), and especially to [Auri](https://huggingface.co/AuriAetherwiing), who basically held my hand through the whole process. Her effort and enthusiasm carried this project forward.
-### Configuration
-The following YAML configuration was used to produce this model:
-```yaml
-base_model: Qwen/Qwen2.5-1.5B-Instruct
-dtype: bfloat16
-merge_method: task_arithmetic
-parameters:
-  normalize: 1.0
-slices:
-- sources:
-  - layer_range: [0, 28]
-    model: bard
-    parameters:
-      weight: 1.0
-  - layer_range: [0, 28]
-    model: scribe
-    parameters:
-      weight: 1.0
-  - layer_range: [0, 28]
-    model: cartographer
-    parameters:
-      weight: 1.0
-  - layer_range: [0, 28]
-    model: alchemist
-    parameters:
-      weight: 1.0
-  - layer_range: [0, 28]
-    model: Qwen/Qwen2.5-1.5B-Instruct
 ```

+---
+base_model:
+- Qwen/Qwen2.5-1.5B-Instruct
+base_model_relation: finetune
+library_name: peft
+pipeline_tag: text-generation
+tags:
+- mergekit
+- merge
+- llama-factory
+- lora
+datasets:
+- allura-org/fujin-cleaned-stage-1
+- Dampfinchen/Creative_Writing_Multiturn
+- ToastyPigeon/SpringDragon
+- allura-org/medquad_sharegpt
+- allura-org/scienceqa_sharegpt
+- Alignment-Lab-AI/orcamath-sharegpt
+language:
+- zho
+- eng
+- fra
+- spa
+- por
+- deu
+- ita
+- rus
+- jpn
+- kor
+- vie
+- tha
+- ara
+---
+# Q25-1.5-VeoLu-R2
+![made with StableNoobAI-IterSPO in sd-webui-forge](veolu.png)
+[*A source of life and hope for the land.*](https://www.youtube.com/watch?v=TJRq1Ag2Wmw)
+Q25-1.5B-Veo Lu is a tiny General-Purpose Creative model, made up of a merge of bespoke finetunes on Qwen 2.5-1.5B-Instruct.
+Inspired by the success of [MN-12B-Mag Mell](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1) and [MS-Meadowlark-22B](https://huggingface.co/allura-org/MS-Meadowlark-22B), Veo Lu was trained on a healthy, balanced diet of Internet fiction, roleplaying, adventuring, and reasoning/general knowledge.
+The components of Veo Lu are:
+* Bard (pretrain, writing): [Fujin (Cleaned/extended Rosier)](https://huggingface.co/datasets/allura-org/fujin-cleaned-stage-1)
+* Scribe (pretrain, roleplay): [Creative Writing Multiturn](https://huggingface.co/datasets/Dampfinchen/Creative_Writing_Multiturn)
+* Cartographer (pretrain, adventuring): [SpringDragon](https://huggingface.co/datasets/ToastyPigeon/SpringDragon)
+* Alchemist (SFT, science/reasoning): [ScienceQA](https://huggingface.co/datasets/allura-org/scienceqa_sharegpt), [MedquadQA](https://huggingface.co/datasets/allura-org/medquad_sharegpt), [Orca Math Word Problems](https://huggingface.co/datasets/Alignment-Lab-AI/orcamath-sharegpt)
+This model is capable of carrying on a scene without going completely off the rails. That being said, it only has 1.5B parameters. So please, for the love of God, *manage your expectations.*
+Since it's Qwen, use ChatML formatting. Turn the temperature down to ~0.7-0.8 and try a dash of rep-pen.
+GGUFs coming soon, but honestly, the full-precision model is 3.5GB in size. You might wanna have a go at running this unquantized with vLLM.
+```
+pip install vllm
+vllm serve Alfitaria/Q25-1.5B-VeoLu --max-model-len 16384 --max-num-seqs 1
+```
+Made by inflatebot.
+Special thanks to our friends at [Allura](https://huggingface.co/allura-org), and especially to [Auri](https://huggingface.co/AuriAetherwiing), who basically held my hand through the whole process. Her effort and enthusiasm carried this project forward.
+### Configuration
+The following YAML configuration was used to produce this model:
+```yaml
+base_model: Qwen/Qwen2.5-1.5B-Instruct
+dtype: bfloat16
+merge_method: task_arithmetic
+parameters:
+  normalize: 1.0
+slices:
+- sources:
+  - layer_range: [0, 28]
+    model: bard
+    parameters:
+      weight: 1.0
+  - layer_range: [0, 28]
+    model: scribe
+    parameters:
+      weight: 1.0
+  - layer_range: [0, 28]
+    model: cartographer
+    parameters:
+      weight: 1.0
+  - layer_range: [0, 28]
+    model: alchemist
+    parameters:
+      weight: 1.0
+  - layer_range: [0, 28]
+    model: Qwen/Qwen2.5-1.5B-Instruct
 ```
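
Reading aid for the configuration above: a minimal sketch of what a `task_arithmetic` merge with `normalize` enabled computes, assuming the usual mergekit behaviour (sum each finetune's weighted delta from the base, then divide by the sum of the weights). The function name `task_arithmetic_merge`, the plain-list "tensors", and the toy values are illustrative assumptions. They are not mergekit's API and not the real Bard/Scribe/Cartographer/Alchemist checkpoints.

```python
from typing import Dict, List

Tensor = List[float]            # stand-in for a real weight tensor
StateDict = Dict[str, Tensor]   # parameter name -> tensor


def task_arithmetic_merge(
    base: StateDict,
    finetunes: Dict[str, StateDict],
    weights: Dict[str, float],
    normalize: bool = True,
) -> StateDict:
    """Hypothetical helper: base + weighted sum of (finetune - base) deltas."""
    total_weight = sum(weights[name] for name in finetunes)
    merged: StateDict = {}
    for param_name, base_tensor in base.items():
        # Accumulate the weighted task vectors for this parameter.
        delta = [0.0] * len(base_tensor)
        for model_name, state in finetunes.items():
            w = weights[model_name]
            for i, (ft, b) in enumerate(zip(state[param_name], base_tensor)):
                delta[i] += w * (ft - b)
        # With normalize on, equal weights of 1.0 reduce to a plain average.
        if normalize and total_weight != 0:
            delta = [d / total_weight for d in delta]
        merged[param_name] = [b + d for b, d in zip(base_tensor, delta)]
    return merged


# Toy example: four finetunes, each weighted 1.0 as in the config.
base = {"w": [1.0, 2.0]}
finetunes = {
    "bard": {"w": [2.0, 2.0]},
    "scribe": {"w": [1.0, 4.0]},
    "cartographer": {"w": [1.0, 2.0]},
    "alchemist": {"w": [1.0, 2.0]},
}
weights = {name: 1.0 for name in finetunes}

merged = task_arithmetic_merge(base, finetunes, weights, normalize=True)
print(merged["w"])  # [1.25, 2.5]
```

With four components at weight 1.0, normalization means each finetune contributes a quarter of its delta, so no single component dominates the base model.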