crimeraaa committed 8a7d9d6 (parent: 35b16b4)
update readme to include findings

README.md (changed)

ChatML as always. Full-precision this time. Quants will come later.
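
For reference, here is a minimal sketch of what a ChatML prompt looks like. The helper below is purely illustrative and not shipped with this model; the role names are just the default ChatML ones discussed under Findings.

```python
# Illustrative ChatML prompt builder; not part of this repo.
def to_chatml(messages: list[dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render a list of {"role": ..., "content": ...} messages as a ChatML prompt."""
    prompt = ""
    for message in messages:
        prompt += f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Leave the prompt open so the model writes the next assistant-role turn.
        prompt += "<|im_start|>assistant\n"
    return prompt

print(to_chatml([{"role": "user", "content": "hey, what's up?"}]))
```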

An experiment to make causal models more conversational. Many of them can already chat, but suffer from problems like occasional dullness, incoherence and verbatim repetition. This model DOES NOT follow assistant-style instructions and IS NOT INTENDED TO.

## Findings

The default ChatML template sometimes causes the model to identify as an AI assistant, which we consider undesirable. This is probably due to the assistant/user/system role markers. Future iterations will likely use our own format.

We fine-tuned the Qwen2.5-7B causal model with a rank-32 LoRA on the [Roleplay-Logs-Sharegpt-Ngram-Cleaned](https://huggingface.co/datasets/NewEden/Roleplay-Logs-Sharegpt-Ngram-cleaned) dataset by NewEden, using only the first 500 rows for 3 epochs. Despite the name, it includes non-RP conversation as well.
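
A rough sketch of that kind of setup is below. This is not the exact training script; beyond the rank, row count, and epoch count, everything here (LoRA alpha, target modules, and so on) is an assumption.

```python
# Rough LoRA fine-tuning sketch; hyperparameters beyond rank/rows/epochs are assumptions.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "Qwen/Qwen2.5-7B"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Rank-32 adapter; the target modules here (attention projections) are assumed.
lora_config = LoraConfig(
    r=32,
    lora_alpha=32,  # assumed; the README only states the rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Only the first 500 rows of the dataset, as described above.
data = load_dataset("NewEden/Roleplay-Logs-Sharegpt-Ngram-cleaned", split="train[:500]")

# From here: render each ShareGPT-style conversation with the ChatML template,
# tokenize, and train for 3 epochs (e.g. with transformers.Trainer or trl's SFTTrainer).
```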

This dataset probably isn't the best; it has a few problems:

- It's not the most coherent thing ever.
- It contains some examples of "brainrot" and nonsensical phrase repetition, which we don't see as inherently bad but which seems to confuse the model a bit.
- It's still partially synthetic, based on [character.ai](https://character.ai) logs, so it's bound to contain some clichéd phrases from their model, which is not ideal. The goal is NOT to replicate the character.ai model, but to build a unique conversational model that is fun to interact with.

However:

- It also contains a lot of interesting conversational patterns which corporate instruct models would never spit out.
- After training, the model is usable and very fun to interact with. It still feels a bit undercooked, so we plan to address that.

We plan to keep this dataset in future iterations, though in moderation. The next iteration will also include dialogue scraped from Reddit, the [Discord-Data](https://www.kaggle.com/datasets/jef1056/discord-data) dataset, and probably other sources we find interesting.

We do not plan to include instructions or synthetic data from models like GPT-4 or Claude, as those have been fine-tuned for agreeability and a professional tone. Moreover, when prompted to write more casually, they tend to stick too rigidly to the guidelines provided (when many are given), or write in a stilted, cheesy and unnatural way (when the instructions are vague).

However, we do plan to experiment with instruction following in the future 😊