crimeraaa committed 8a7d9d6 (parent: 35b16b4)
update readme to include findings

README.md (changed)

ChatML as always. Full-precision this time. Quants will come later.
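
For reference, here is a minimal sketch of what a ChatML prompt looks like. The helper below is purely illustrative and not shipped with this model; the role names are just the default ChatML ones discussed under Findings.

```python
# Illustrative ChatML prompt builder; not part of this repo.
def to_chatml(messages: list[dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render a list of {"role": ..., "content": ...} messages as a ChatML prompt."""
    prompt = ""
    for message in messages:
        prompt += f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Leave the prompt open so the model writes the next assistant-role turn.
        prompt += "<|im_start|>assistant\n"
    return prompt

print(to_chatml([{"role": "user", "content": "hey, what's up?"}]))
```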

An experiment to make causal models more conversational. Many of them can already chat, but suffer from problems like occasional dullness, incoherence and verbatim repetition. This model DOES NOT follow assistant-style instructions and IS NOT INTENDED TO.

## Findings

The default ChatML template sometimes causes the model to identify as an AI assistant, which we consider undesirable. This is probably due to the assistant/user/system role markers. Future iterations will likely use our own format.

We fine-tuned the Qwen2.5-7B causal model with a rank-32 LoRA on the [Roleplay-Logs-Sharegpt-Ngram-Cleaned](https://huggingface.co/datasets/NewEden/Roleplay-Logs-Sharegpt-Ngram-cleaned) dataset by NewEden, using only the first 500 rows for 3 epochs. Despite the name, it includes non-RP conversation as well.
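
A rough sketch of that kind of setup is below. This is not the exact training script; beyond the rank, row count, and epoch count, everything here (LoRA alpha, target modules, and so on) is an assumption.

```python
# Rough LoRA fine-tuning sketch; hyperparameters beyond rank/rows/epochs are assumptions.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "Qwen/Qwen2.5-7B"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Rank-32 adapter; the target modules here (attention projections) are assumed.
lora_config = LoraConfig(
    r=32,
    lora_alpha=32,  # assumed; the README only states the rank
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Only the first 500 rows of the dataset, as described above.
data = load_dataset("NewEden/Roleplay-Logs-Sharegpt-Ngram-cleaned", split="train[:500]")

# From here: render each ShareGPT-style conversation with the ChatML template,
# tokenize, and train for 3 epochs (e.g. with transformers.Trainer or trl's SFTTrainer).
```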

This dataset probably isn't the best; it has a few problems:

- It's not the most coherent thing ever.
- It contains some examples of "brainrot" and nonsensical phrase repetition, which we don't see as inherently bad but which seems to confuse the model a bit.
- It's still partially synthetic, based on [character.ai](https://character.ai) logs, so it's bound to contain some clichéd phrases from their model, which is not ideal. The goal is NOT to replicate the character.ai model, but to build a unique conversational model that is fun to interact with.

However:

- It also contains a lot of interesting conversational patterns which corporate instruct models would never spit out.
- After training, the model is usable and very fun to interact with. It still feels a bit undercooked, so we plan to address that.

We plan to keep this dataset in future iterations, though in moderation. The next iteration will also include dialogue scraped from Reddit, the [Discord-Data](https://www.kaggle.com/datasets/jef1056/discord-data) dataset, and probably other sources we find interesting.

We do not plan to include instructions or synthetic data from models like GPT-4 or Claude, as those have been fine-tuned for agreeability and a professional tone. Moreover, when prompted to write more casually, they tend to stick too rigidly to the guidelines provided (when many are given), or write in a stilted, cheesy and unnatural way (when the instructions are vague).

However, we do plan to experiment with instruction following in the future 😊