huehui's picture
Upload README.md with huggingface_hub
4c16432 verified
---
tags:
- transformers
- causal-lm
- text-generation
- instruct
- chat
- fine-tuned
- merged-lora
- llama-3
- hermes
- discord-dataset
- conversational-ai
- chatml
- pytorch
- open-weights
- 3b-parameters
- abliterated
base_model:
- NousResearch/Hermes-3-Llama-3.2-3B-abliterated
model-index:
- name: Discord-Micae-Hermes-3-3B
results: []
datasets:
- mookiezi/Discord-OpenMicae
library_name: transformers
license: mit
---
<div style="display: flex; align-items: center; gap: 8px;">
<span>Try this model on Google Colab for free:</span>
<a href="https://colab.research.google.com/drive/1VuvvuvEWW-RZRO0l4xBPwMfsa1FjBjuV?usp=sharing">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"/>
</a>
</div>
## Discord-Micae-Hermes-3-3B-abliterated
## Model Description
This is an uncensored version of [mookiezi/Discord-Micae-Hermes-3-3B](https://huggingface.co/mookiezi/Discord-Micae-Hermes-3-3B) created with abliteration (see [remove-refusals-with-transformers](https://github.com/Sumandora/remove-refusals-with-transformers) to know more about it).
This is a crude, proof-of-concept implementation to remove refusals from an LLM model without using TransformerLens.
![Micae](Micae.png)
## Generation
```python
from transformers import pipeline
gen = pipeline("text-generation", model="mookiezi/Discord-Micae-Hermes-3-3B-abliterated")
print(gen(
"<im_start>user\nwhat do you do?<|im_end|>",
max_new_tokens=100
))
```
## Discord-Micae-Hermes-3-3B Model Description
Discord-Micae-Hermes-3-3B is a new finetune on [NousResearch/Hermes-3-Llama-3.2-3B](https://huggingface.co/NousResearch/Hermes-3-Llama-3.2-3B).
The model was trained on 17 million tokens of 250 thousand Discord STX (single turn exchanges) for 6 epochs and 5.5 million tokens of 100 thousand multi-turn chains for 6 epochs at learn rate 2e-5, finishing with both datesets combined for 1 epoch at 1e-5. We used a cosine warmup with 220 warmup steps for each phase. The LoRA adapter was trained with alpha = 32 and r = 8.
## Dataset
This model was fine-tuned on the [mookiezi/Discord-OpenMicae](https://huggingface.co/datasets/mookiezi/Discord-OpenMicae) dataset.
## Prompting
Micae uses the same ChatML as the prompt format as Hermes 3 and handles context.
```
<|im_start|>user
what do you think about ai?<|im_end|>
<|im_start|>assistant
It's cool, but I don't know what it is.<|im_end|>
<|im_start|>user
where do you think it's heading?<|im_end|>
<|im_start|>assistant
I can only speculate and make guesses.<|im_end|>
<|im_start|>user
what do you do use it for?<|im_end|>
<|im_start|>assistant
For fun and to learn new things.<|im_end|>
<|im_start|>user
like what?<|im_end|>
<|im_start|>assistant
Like playing games or building robots.<|im_end|>
<|im_start|>user
do you like pokemon?<|im_end|>
<|im_start|>assistant
Yes, but not the actual game.<|im_end|>
<|im_start|>user
what do you mean?<|im_end|>
<|im_start|>assistant
The idea behind the game, that a person could create an entire world from nothing.<|im_end|>
```