Manual Activation of Thinking/Output block
Excellent job on the model... been looking for this!!!
Manual activation:
I found that using "chatml" template (manually selected) and this system prompt:
You are an AI focused on providing systematic, well-reasoned responses. Response Structure: - Format: `<think>{reasoning}</think>{answer}` - Process: Think first, then answer.
Activates the thinking "block" and output generation.
However, I see in the source repo that the "tokenizer... json" was updated 4 days ago, so that might fix the issue with the "jinja template" / "think block" activation.
Going to download the source and quantize it locally...
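For reference, the ChatML template wraps each turn in `<|im_start|>`/`<|im_end|>` markers, with the system prompt in the first turn. A minimal sketch of the final prompt string this produces (plain string assembly, not tied to LM Studio internals; the prompt text is the tag-complete version from this thread):

```python
# The system prompt from this thread, with the think tags intact.
SYSTEM_PROMPT = (
    "You are an AI focused on providing systematic, well-reasoned responses. "
    "Response Structure: - Format: <think>{reasoning}</think>{answer} "
    "- Process: Think first, then answer."
)

def chatml_prompt(user_message: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Assemble a ChatML prompt ending with an open assistant turn,
    so generation continues as the assistant (ideally with <think> first)."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(chatml_prompt("What is 2 + 2?"))
```

Selecting the "chatml" template in LM Studio should yield roughly this layout; the helper name here is just illustrative.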
Oh, thanks for your comment! Could you share a detailed example of how you did that?
Also, didn't the model get stuck in an infinite response loop? And what parameters did you set?
Hey;
Seems some of the copy/paste did not come through (the "think" tags got stripped), so here it is again with the tags escaped:
SYSTEM PROMPT:
You are an AI focused on providing systematic, well-reasoned responses. Response Structure: - Format: `<think>{reasoning}</think>{answer}` - Process: Think first, then answer.
USAGE:
LM Studio: developer mode -> entered the "system prompt" above, set "chat template" to "chatml".
TEMPS: .6
Found temps around .1 best for solving, but got loops sometimes; temps over 1 reduced loops (both "thinking" and "output" loops).
These temps seem to be Mistral-specific, as other "thinking Mistrals" work/solve best at very low temps.
(Other params: Rep pen 1.1, TopK 40, TopP .95, MinP .05; Rep pen range: 64-128, which helps keep reasoning on track / quality of output.)
Looping issues (in the output, and maybe the thinking) can be filtered out using parameters like rep pen range, rep pen, and/or DRY settings.
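As a sketch of why "rep pen range" matters: a repetition penalty typically rescales the logits of tokens seen only within the last N generated positions (N = the range), so a small range like 64-128 penalizes short-loop repeats without flattening the whole context. A toy version following the common llama.cpp-style rule (divide positive logits by the penalty, multiply negative ones); all names here are illustrative, not LM Studio's actual internals:

```python
def apply_repetition_penalty(logits, recent_tokens, penalty=1.1, rep_range=64):
    """Toy repetition penalty: rescale the logits of tokens that appeared
    in the last `rep_range` generated tokens (llama.cpp-style convention:
    positive logits are divided by the penalty, negative ones multiplied)."""
    penalized = dict(logits)  # token_id -> logit
    window = set(recent_tokens[-rep_range:])  # only recent tokens are penalized
    for tok in window:
        if tok in penalized:
            if penalized[tok] > 0:
                penalized[tok] /= penalty
            else:
                penalized[tok] *= penalty
    return penalized

# Token 7 appeared in the recent window, so its logit is pushed down;
# token 9 did not, so it is untouched.
logits = {7: 2.0, 9: 1.5}
out = apply_repetition_penalty(logits, recent_tokens=[7, 3, 7], penalty=1.1)
```

A larger `rep_range` catches longer loops but also punishes legitimate reuse of terms during the "thinking" phase, which is likely why a modest 64-128 range keeps reasoning on track.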
NOTE: Without the system prompt, "thinking" still works, followed by output... but not inside a "thinking" block.
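Once the model does emit proper `<think>...</think>` tags, splitting the reasoning from the final answer is a simple parse. A minimal sketch, assuming at most one well-formed, non-nested block:

```python
import re

def split_think(response: str):
    """Split a model response into (reasoning, answer).
    Assumes at most one well-formed <think>...</think> block;
    returns empty reasoning when no block is present."""
    m = re.search(r"<think>(.*?)</think>", response, flags=re.DOTALL)
    if not m:
        return "", response.strip()
    reasoning = m.group(1).strip()
    answer = response[m.end():].strip()
    return reasoning, answer

reasoning, answer = split_think("<think>2+2 is 4</think>The answer is 4.")
```

This also handles the no-system-prompt case noted above: with no tags, everything lands in the answer.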