Please Help | Model Starts Repeating Itself
Hey, thank you so much for making this model. After reading your guide I felt compelled to download it and use it in SillyTavern. After trying a lot of local and online models, I was like, finally, we have a proper horror model that doesn't shy away from bad endings.
Unfortunately though, the model (Q6_K quant) starts repeating itself after some time. I'm using the Llama 3 template, your suggested system prompt, 0.8 temp, and 1.1 rep penalty.
I tried lowering the temp, which stopped the repetition but produced more GPT-isms. Increasing the rep penalty instead gave the same result (GPT-isms but no repeated words). Temp 0 produced GPT-isms as well.
This was the only gen I thought was going fantastically but then it started repeating the word "macabresque."
Please let me know if you can help me with it. I have an RTX 4070 Ti SUPER with 16GB of VRAM, 64GB of sys ram, I'm using Ollama with SillyTavern, and Q8 KV cache is enabled for Ollama. And as I'm writing this, I wonder if maybe the KV cache is bothering this model?
This is the generation. Yes, I shamelessly used one of your prompts and changed the gender to a man to see how this model would write the story. :)
Me: Using the following "story idea" below, write the first scene in the novel introducing the young man. This scene should start in the middle of the action, include dialog, vivid passages, and end on a cliffhanger relevant to the story idea but it should also be unexpected. The scene should be 1000 words long and escalate in conflict and suspense and be written in first person, present tense with the point of view character being the young man.
Story idea: In a world ruled by dictatorship, a rebel young man leads a rebellion against the system. Despite the risks, he fights to overthrow the dictator and restore democracy to his town. The government executes him for treason, but he sticks to his beliefs and is responsible for starting the revolution.
Model: "Get down, now!" someone yells in my ear, roughly yanking me to the dirty cobblestone street. I land hard on my ass, my ribs crunching against the unforgiving ground. The impact jars my senses back into focus as boot heels thud mere inches away.
I twist my head up, squinting through a veil of bloodied hair. A lean, angular face looms above, eyes wild with fear and fury. "Move, fucker!" he growls through gritted teeth.
Instincts override shock. I scramble to my feet, my scraped palms leaving bloody smudges on the slick stones. We're sprinting down a pitch-black alleyway, the only faint light seeping through boarded-up windows and the occasional flicker of a guttering torch. The acrid stench of pitch and sulfur chokes my throat.
"Behind us!" someone else barks, his voice hoarse with exertion.
I pump my legs harder, nearly tripping over discarded refuse and mangled bodies strewn across the narrow passageway. The guttural snarl of pursuing boots grows louder, closing in like a pack of rabid dogs.
A meaty fist collides with my shoulder blade, sending me careening hard to the right. I manage to regain balance mere instants before my shoulder collides with the unyielding brick wall. The impact jars my already bruised ribcage.
"Keep running!" the gravelly voice bellowed behind me.
I don't need further incentive. I launch myself forward again, my heart hammering against my ribcage like a sledgehammer against anvilized steel.
We round a sharp corner, careening into an open square bathed in the sickly light of torches and bonfires. The sight sears itself into my retinas: charred bodies splayed across the cobblestones, charred meat hanging from gibbets like macabresque macabresque macabresque
(goes on and produces this word about 50 more times before stopping)
Please, if you can, let me know your thoughts on why this is happening... thanks again for making this amazing model!
I removed the KV cache command and the model still repeats itself at .8 temp and 1.1 rep penalty. :(
Lowering rep pen to 1.09 seems to have worked, but the generation doesn't sound as natural as the one above... it sounds more like ChatGPT for some weird reason... EDIT: I've been able to replicate the repetition issue 4 times now. Interestingly, it seems to only happen at 0.8 temp and 1.1 rep penalty; at 1.09 it doesn't repeat, but it sounds more GPT-ish. EDIT: nope, even at 1.09 it starts repeating again. Weird lol.
Please see this document, as this model is a class 3/4 - and needs special "ST" settings for long / multi turn usage.
Screenshots for ST settings provided in this doc.
You may also want to read the section "generational steering" too.
It appears Ollama didn't download the V2 of this model. :(
You can keep this as closed if you want, David, I will check your doc out, hopefully that will let me know the fix!
Just wanted to let you know David that I'm unable to download the V2 version of the model for Ollama. There are two different Q6_K versions available but both download V1 of the model. This is the only available command: ollama run hf.co/DavidAU/L3-Stheno-Maid-Blackroot-Grand-HORROR-16B-GGUF:Q6_K
Let me know if you can help. I've been scratching my head for 30 minutes lol. I tried downloading the GGUF file directly but I have no idea what to do with it.
Okay, David, I manually downloaded the V2 GGUF file and created the Ollama model, it seems to be working. As for the repetition, I will do more tests and see what works for me with the temp and rep penalty settings. Thanks again!
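For anyone else stuck at the same step, here is a rough sketch of how a manually downloaded GGUF can be turned into an Ollama model with a Modelfile. The file name and model name below are hypothetical placeholders (the actual V2 file name isn't given in this thread), and the two sampler values are just the ones discussed above, not a recommended fix. Front ends such as SillyTavern usually send their own sampler settings with each request, so treat the `PARAMETER` lines as defaults only.

```shell
#!/bin/sh
# Hypothetical names: replace with the GGUF you actually downloaded
# and whatever you want the model to be called in Ollama.
GGUF_FILE="./horror-16b-v2-Q6_K.gguf"
MODEL_NAME="horror-16b-v2"

# Write a minimal Modelfile that points at the local GGUF file.
# The sampler values are the ones from this thread, used as defaults.
cat > Modelfile <<EOF
FROM $GGUF_FILE
PARAMETER temperature 0.8
PARAMETER repeat_penalty 1.1
EOF

# Register the model only if the Ollama CLI and the GGUF file are present.
if command -v ollama >/dev/null 2>&1 && [ -f "$GGUF_FILE" ]; then
  ollama create "$MODEL_NAME" -f Modelfile
else
  echo "Skipping 'ollama create': ollama CLI or $GGUF_FILE not found"
fi
```

Once the model is created, `ollama run horror-16b-v2` (or whatever name you picked) should load the local file instead of pulling from hf.co.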