---
license: apache-2.0
datasets:
- NewEden/Roleplay-Logs-Sharegpt-Ngram-cleaned
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
base_model:
- Qwen/Qwen2.5-7B
pipeline_tag: text-generation
tags:
- unsloth
- dialogue
---

ChatML as always. Full-precision this time. Quants will come later.
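If you want to try it, a minimal `transformers` sketch looks something like this (the repo id below is a placeholder for wherever this model is hosted, and the sampling settings are generic defaults, not tuned recommendations):

```python
# Hypothetical usage sketch: "your-org/this-model" is a placeholder repo id,
# and the sampling settings are generic defaults, not tuned values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "your-org/this-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,  # full-precision weights; no quants yet
    device_map="auto",
)

# ChatML via the tokenizer's chat template. Remember: this is a
# conversational model, not an assistant; talk to it, don't instruct it.
messages = [{"role": "user", "content": "hey, how's your day going?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.8)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```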

An experiment to make causal models more conversational. Many of them can already chat, but suffer from problems like occasional dullness, incoherence and verbatim repetition. This model DOES NOT follow assistant-style instructions and IS NOT INTENDED TO.

## Findings

The default ChatML template causes the model to sometimes identify as an AI assistant, which we'd consider undesirable. This is probably due to the assistant/user/system markers. Future iterations will likely use our own format.
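For reference, a ChatML conversation looks like this (a generic illustration, not this model's exact prompt); the role names baked into the turn markers are the likely source of the assistant persona:

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
hey, what's up?<|im_end|>
<|im_start|>assistant
not much, you?<|im_end|>
```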

We fine-tuned the Qwen2.5-7B causal model with a rank-32 LoRA on the [Roleplay-Logs-Sharegpt-Ngram-Cleaned](https://huggingface.co/datasets/NewEden/Roleplay-Logs-Sharegpt-Ngram-cleaned) dataset by NewEden, specifically the first 500 rows, for 3 epochs. Despite the name, it includes non-RP conversation as well.
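A rough sketch of that run with Unsloth follows. Only the base model, dataset, rank, row count, and epoch count come from the description above; the remaining hyperparameters and the column handling are illustrative assumptions:

```python
# Sketch of the described run: Qwen2.5-7B base, rank-32 LoRA, first 500 rows,
# 3 epochs. Everything else (alpha, target modules, lr, batch size, column
# names) is an illustrative assumption, not the actual training config.
from unsloth import FastLanguageModel  # import unsloth first so its patches apply

from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-7B",
    max_seq_length=4096,              # assumption
)
model = FastLanguageModel.get_peft_model(
    model,
    r=32,                             # the rank-32 LoRA described above
    lora_alpha=32,                    # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # common default
)

# First 500 rows, as described. Assumes ShareGPT-style rows with a
# "conversations" list of {"from": ..., "value": ...} turns.
raw = load_dataset("NewEden/Roleplay-Logs-Sharegpt-Ngram-cleaned",
                   split="train[:500]")

def to_chatml(row):
    roles = {"system": "system", "human": "user", "gpt": "assistant"}
    msgs = [{"role": roles.get(t["from"], "user"), "content": t["value"]}
            for t in row["conversations"]]
    # Render each conversation to ChatML text with the tokenizer's template.
    return {"text": tokenizer.apply_chat_template(msgs, tokenize=False)}

dataset = raw.map(to_chatml)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=4096,
    args=TrainingArguments(
        num_train_epochs=3,               # 3 epochs, as described
        per_device_train_batch_size=2,    # assumption
        learning_rate=2e-4,               # assumption
        output_dir="outputs",
    ),
)
trainer.train()
```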

This dataset probably isn't the best; it has a few problems:
- It's not the most coherent thing ever.
- It contains some examples of "brainrot" and nonsensical phrase repetition, which we don't see as inherently bad, but which seem to confuse the model a bit.
- It's still partially synthetic, based on [character.ai](https://character.ai) logs, so it's bound to contain some clichéd phrases from their model, which is not ideal. The goal is NOT to replicate the character.ai model, but to build a unique conversational model that is fun to interact with.

However:
- It also contains a lot of interesting conversational patterns which corporate instruct models would never spit out.
- After training, the model is usable and very fun to interact with. It still feels a bit undercooked, so we plan to address that.

We plan to keep this dataset in future iterations, though in moderation. The next iteration will also include dialogue scraped from Reddit, the [Discord-Data](https://www.kaggle.com/datasets/jef1056/discord-data) dataset, and probably other sources we find interesting.

We do not plan to include instructions or synthetic data from models like GPT-4 or Claude, as those have been fine-tuned for agreeability and a professional tone. Moreover, when prompted to write more casually, they either stick too rigidly to the guidelines provided (when many are given) or write in a stilted, cheesy and unnatural way (when the instructions are vague).

However, we do plan to experiment with instruction following in the future 😊