Discord-Micae-Hermes-3-3B
Model Description
Discord-Micae-Hermes-3-3B is a fine-tune of NousResearch/Hermes-3-Llama-3.2-3B.
This model serves as a foundation for our ongoing exploration into the capabilities of human-adjacent text generation.
Training Details
- Fine-Tuning Method: LoRA merge (α = 32, r = 8, dropout = 0.1)
- Training Schedule:
  - 17M tokens of 260 thousand single-turn exchanges (STX) – 6 epochs @ 2e-5
  - 5.5M tokens of 101 thousand multi-turn chains – 6 epochs @ 2e-5
  - Combined dataset – 1 epoch @ 1e-5
- Scheduler: Cosine schedule with 220 warmup steps per phase
- Batching: Effective size of 126 (7 per device × 18 gradient accumulation steps)
Training took place over 17 days on a single GTX 1080 (8GB).
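The batching and scheduler figures above can be checked with a little arithmetic. The following is a minimal sketch, not the actual training script: it assumes linear warmup followed by cosine decay to zero (a common convention, which the card does not spell out), treats "260 thousand" as exactly 260,000 examples, and counts a trailing partial batch as one step. The function names are hypothetical.

```python
import math

PER_DEVICE_BATCH = 7     # from the card
GRAD_ACCUM_STEPS = 18    # from the card
WARMUP_STEPS = 220       # from the card, applied per phase


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step."""
    return per_device * accum_steps * num_devices


def optimizer_steps(num_examples: int, epochs: int, batch_size: int) -> int:
    """Optimizer steps for a phase (assumption: a trailing partial batch counts as a step)."""
    return math.ceil(num_examples / batch_size) * epochs


def cosine_lr(step: int, total_steps: int, peak_lr: float,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup to peak_lr, then cosine decay to zero (assumed schedule shape)."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


batch = effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM_STEPS)  # 7 * 18 = 126

# Approximate step count for the single-turn phase (260k examples is a rounded figure).
stx_steps = optimizer_steps(260_000, epochs=6, batch_size=batch)

# Learning rate at a few points of that phase, peak 2e-5.
for step in (0, WARMUP_STEPS, stx_steps // 2, stx_steps):
    print(step, cosine_lr(step, stx_steps, peak_lr=2e-5))
```

With only 7 examples per device, the 18 accumulation steps are what bring the effective batch up to 126 on a single 8GB card.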
Dataset
The model was fine-tuned on the mookiezi/Discord-OpenMicae dataset.
Intended Use
- Conversational AI research
- Experimentation with dialogue agents trained on Discord data
- Chatbots requiring casual, human-like tone
Limitations
- The model inherits potential biases from Discord-style language.
- It is not safety-aligned for deployment without moderation.
- While it does inherit knowledge from Hermes-3-Llama-3.2-3B, it is not intended for factual or sensitive information retrieval.
Prompting
Micae uses the same ChatML prompt format as Hermes 3 and handles context and chat history.
<|im_start|>user
what do you think about ai?<|im_end|>
<|im_start|>assistant
I'm not a fan of AI but I can understand why people are excited to use it. It's like the first time they got an electric car, or when they were able to fly in space, that excitement is real<|im_end|>
<|im_start|>user
where do you think it's heading?<|im_end|>
<|im_start|>assistant
It will be used for good and bad things just as with other techs<|im_end|>
<|im_start|>user
what do you do use it for personally?<|im_end|>
<|im_start|>assistant
Nothing much yet, I hope to find some stuff later on.<|im_end|>
<|im_start|>user
do you like pokemon?<|im_end|>
<|im_start|>assistant
Yes I love Pokémon.<|im_end|>
<|im_start|>user
what's your favorite pokemon<|im_end|>
<|im_start|>assistant
Garchomp<|im_end|>
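A transcript like the one above can be assembled from a list of messages. This is a minimal sketch of the ChatML layout shown in the example; `build_chatml_prompt` is a hypothetical helper, not part of any library. If the model's tokenizer ships a chat template, `tokenizer.apply_chat_template` in transformers is likely the more robust route.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(messages: List[Dict[str, str]],
                        add_generation_prompt: bool = True) -> str:
    """Render role/content messages as ChatML turns.

    With add_generation_prompt=True, an open assistant turn is appended
    so the model continues with its reply.
    """
    parts = [f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages]
    if add_generation_prompt:
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


# Chat history taken from the example transcript above.
history = [
    {"role": "user", "content": "do you like pokemon?"},
    {"role": "assistant", "content": "Yes I love Pokémon."},
    {"role": "user", "content": "what's your favorite pokemon"},
]
prompt = build_chatml_prompt(history)
print(prompt)
```

Keeping earlier turns in `history` is what lets the model answer follow-up questions in context.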
Generation
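from transformers import pipeline

# Load the model into a text-generation pipeline.
gen = pipeline("text-generation", model="mookiezi/Discord-Micae-Hermes-3-3B")

# The prompt is one ChatML user turn followed by an open assistant turn.
print(gen(
    "<|im_start|>user\nwhat do you do?<|im_end|>\n<|im_start|>assistant\n",
    max_new_tokens=100,
))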
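The pipeline returns a list of dicts whose `generated_text` field, by default, contains the prompt followed by the completion. A minimal sketch for pulling out just the assistant's reply follows; `extract_reply` is a hypothetical helper, and the sample output is a hard-coded stand-in rather than real model output. Whether the `<|im_end|>` marker appears in the text depends on decoding settings, so the helper handles both cases.

```python
def extract_reply(generated_text: str, prompt: str,
                  end_token: str = "<|im_end|>") -> str:
    """Strip the prompt prefix and cut the completion at the first end-of-turn marker."""
    if generated_text.startswith(prompt):
        completion = generated_text[len(prompt):]
    else:
        completion = generated_text
    # If the marker is absent, split() returns the whole completion unchanged.
    return completion.split(end_token, 1)[0].strip()


prompt = "<|im_start|>user\nwhat do you do?<|im_end|>\n<|im_start|>assistant\n"

# Stand-in for a pipeline result (illustrative text, NOT real model output).
fake_output = [{
    "generated_text": prompt + "not much, just hanging out<|im_end|>\n<|im_start|>user\n"
}]

reply = extract_reply(fake_output[0]["generated_text"], prompt)
print(reply)
```

Cutting at the first end-of-turn marker also discards any extra turns the model may generate after its own reply.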
License
See the Meta Llama 3 Community License for details.
How to Cite
If you use this model in your work, please cite both Discord-Micae-Hermes-3-3B and the base model Hermes 3:
@misc{discord-micae-hermes3b,
author = {mookiezi},
title = {Discord-Micae-Hermes-3-3B},
year = {2025},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/mookiezi/Discord-Micae-Hermes-3-3B}}
}
@misc{teknium2024hermes3technicalreport,
title={Hermes 3 Technical Report},
author={Ryan Teknium and Jeffrey Quesnelle and Chen Guang},
year={2024},
eprint={2408.11857},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2408.11857}
}