General.
Not bad, I think it has potential if the formatting issues are solved.
I'd recommend changing at least one thing in the system prompt.
From
"Include your thought process here."
to
"Include {{char}}'s thought process here."
Otherwise you get moralizing and OOC reasoning.
Unfortunately, I most likely won't be returning to any further training anytime soon. I recommend Mistral v3 over the ChatML presets; I've gotten fairly consistent results formatting-wise. But as usual, most of the models are fairly experimental.
Is there a way to hide reasoning from prompts? Sorry if it's a dumb question, I am kinda new to ST.
There are screenshots on the main model card repo showing how to properly set it up with ST. Going back and testing both GRPO (Violent/Violet) models after spending a week on Qwen, I still don't see any kind of consistent formatting issues on my end, aside from the super occasional hiccup of 12B oddity (tested with both ChatML and Mistral v3). But alas, still no new models for a while. I would have had some Qwen 7/14B GRPOs out, but the training software took a dive somewhere in the middle of training, and I didn't realize until it was far too late in the cycle. (RIP an easy $150 :kek:)
@Nitral-AI Well, now, that was a tragedy. Best of luck with the retakes if you still want to push out something definitive Qwen-based, assuming the curse is lifted, haha... I am fighting my own battles on another software front, and sometimes it's really not fun.
I'll make the full set-up more obvious and have a note to check back the original card to make sure there aren't new caveats added in quants uploaded in the future.
I am very particular about having consistent formatting, so even if 9/10 outputs are in line, that single one that isn't really gets to me. I am biased, but I just prefer a certain style, and if I feel like the model isn't naturally in tune with it, I believe in changing to a model that better matches the card/RP session, even at the cost of dealing with some sloppiness if needed.
I know some people, online and IRL, who do LLM RP and don't care much at all; it must be liberating.
Unfortunately, the lost training runs are what they are. The plan is to revisit them at some point, ideally using (qwq-32B, qwen-2.5vl-7B/14B, and eventually phi-14B if I ever get around to it).
Apologies for not being more proactive in clarifying the formatting. The reasoning block setup is definitely helpful, but the model was also trained using answer tags; they just aren't included in the reasoning block setup in ST.
I’ve made the model card in the repository importable for those who want a working example in ST.
My best advice for this setup is to use a system prompt that encourages tagging, prefix responses with , and adjust the character’s first/greeting message to match the format shown above.
Sorry to hear about your own software challenges; I hope you're able to resolve them soon! (Now I'm off to crawl into bed after 45+ hours of memery.)
I don't understand the internals of these models, but in terms of usage there are two problems: repetition of text, and text in Chinese. Not a constant problem, but a frequent one. I've tried different bots and customizations, and I'm using koboldcpp. Also, the eye color of the "persona" gets mentioned constantly, and not only the persona's but the "bot's" as well.
Please help, I am at my wits' end. I love the model, but for some reason I just can't get context shift to work. I tried lowering the context size and the response size, enabling/disabling flash attention, MMAP, and KV offload, playing with layers, and BLAS batch sizes 512 -> 256. I just can't make it work, and waiting for the prompt to reprocess every single time just kills it for me.
@M4L1C3
Remove the weekday, time, and date macros from the system prompt. It currently contains {{weekday}} {{date}} {{time}}.
Removing that line should do it.
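For reference, the offending line in the preset should look something like the snippet below (the exact wording may differ between preset versions); delete the whole sentence rather than just the macros, since any leftover dynamic text has the same effect:

```
It is currently {{weekday}} {{date}} {{time}}.
```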
@M4L1C3 -- Hitting reprocess constantly while RPing sounds terrible, and not normal.
Share:
- version of KCPP
- all the settings you're using in KCPP or the full cli command
- some hardware information like GPU and RAM
I recommend launching it from the terminal/CLI directly, like this [using NVIDIA, --contextsize 8192, koboldcpp-v1.88]:
koboldcpp.exe --model "D:\...\Model-Quant-imat.gguf" --blasbatchsize 256 --quiet --flashattention --quantkv 0 --remotetunnel --multiuser 2 --gpulayers 99 --usecublas --port 6969 --contextsize 8192
Actually, Nitral just noted it correctly: the system prompt has dynamic information, which will break context shift. Make sure you're not inserting "memories" or a "summary" early in the context (it should be at the end). Good call; if you don't mind, I'll remove the time grounding section from the presets in this repo. – Updated presets based on the main repo and removed the dynamic time part.
Dealt with this a long while back with another user during the late Poppy/early Hathor era. (I should add a warning to the model card, maybe, or just not ship the presets with said macros.) Either way, sorry for the inconvenience.
Yeah, the reported issue was on the latest KCPP (1.88) and latest ST. I also had an older staging ST where the issue did not occur, so I crossed KCPP off the list.
ST staging: ChatML master import, Nitral baseline preset (probably an older master import).
ST: ChatML, but indeed the system prompt was different and included {{time}} {{date}}. After removing it, the issue stopped.
No problem, I appreciate the work you do. I don't quite understand why it would break context shift; I won't even speculate.
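Rough intuition, sketched in Python (the token lists below are made up for illustration, not output from a real tokenizer): context shift can only reuse the KV cache for the longest prefix that is identical between the previous prompt and the new one. A {{time}} or {{date}} macro near the top of the system prompt changes on every request, so the two prompts diverge within the first few tokens and nearly the whole context has to be reprocessed every turn.

```python
def shared_prefix_len(prev_tokens, cur_tokens):
    """Count leading tokens identical between two prompts.
    Only this prefix can be served from the KV cache; everything
    after the first mismatch must be re-evaluated by the model."""
    n = 0
    for a, b in zip(prev_tokens, cur_tokens):
        if a != b:
            break
        n += 1
    return n

# Static system prompt: only the newly appended turn differs.
prev = ["<sys>", "You", "are", "Char", "</sys>", "Hi", "there"]
cur  = ["<sys>", "You", "are", "Char", "</sys>", "Hi", "there", "How", "are", "you"]
print(shared_prefix_len(prev, cur))  # 7 -> only the 3 new tokens get processed

# Dynamic time macro: the prompts diverge almost immediately.
prev = ["<sys>", "It", "is", "13:40", "</sys>", "Hi", "there"]
cur  = ["<sys>", "It", "is", "13:41", "</sys>", "Hi", "there", "How", "are", "you"]
print(shared_prefix_len(prev, cur))  # 3 -> everything after token 3 reprocessed
```

The same logic explains the advice about memories/summaries: anything that changes from turn to turn should sit as late in the context as possible, so the stable prefix in front of it stays cacheable.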