Upload README.md with huggingface_hub
README.md
CHANGED
@@ -59,7 +59,7 @@ print(gen(
 
 Discord-Micae-Hermes-3-3B is a new finetune on [NousResearch/Hermes-3-Llama-3.2-3B](https://huggingface.co/NousResearch/Hermes-3-Llama-3.2-3B).
 
-The model was trained on 17 million tokens from 250 thousand Discord STX (single-turn exchanges) for 6 epochs and 5.5 million tokens from 100 thousand multi-turn chains for 6 epochs at a learning rate of 2e-5, finishing with both datasets combined for 1 epoch at 1e-5. We used a cosine warmup with 220 warmup steps for each phase.
+The model was trained on 17 million tokens from 250 thousand Discord STX (single-turn exchanges) for 6 epochs and 5.5 million tokens from 100 thousand multi-turn chains for 6 epochs at a learning rate of 2e-5, finishing with both datasets combined for 1 epoch at 1e-5. We used a cosine warmup with 220 warmup steps for each phase. The LoRA adapter was trained with alpha = 32 and r = 8.
 
 ## Dataset
 
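For reference, a minimal sketch of a training setup matching the hyperparameters added in this change (r = 8, alpha = 32, learning rate 2e-5, cosine schedule with 220 warmup steps, 6 epochs per phase), assuming Hugging Face `peft` and `transformers`. The target modules, output directory, and other arguments are illustrative assumptions, not values stated in the README:

```python
# Hypothetical sketch of the LoRA configuration described above.
# Only r=8, lora_alpha=32, lr=2e-5, the cosine schedule, 220 warmup
# steps, and 6 epochs come from the README; the rest is assumed.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Hermes-3-Llama-3.2-3B"
)

lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],  # assumed; not stated in the README
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="micae-lora",   # placeholder path
    learning_rate=2e-5,        # per-phase rate; final combined epoch used 1e-5
    lr_scheduler_type="cosine",
    warmup_steps=220,          # cosine warmup, per phase
    num_train_epochs=6,
)
```

Per the README, this configuration would be run once per phase (STX, then multi-turn chains), followed by one epoch on the combined data with `learning_rate=1e-5`.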