Great model

#1
by JasonNan - opened

This is a very good model. I have a question for my own understanding. When using this model, which part of the output comes from which part of the model? Are they separate? Does DeepHermes-3-Llama-3-8B-Preview contribute only to the tag parts, while the actual final result comes only from Dark Planet 8B? Or is it a more complex blend? (Excuse my ignorance, I'm not an expert in LLMs.)

Thank you!

This model uses a mixing method at the final layers, rather than letting the "reasoning model" totally control the output at that point.
This lets more of the core model "shine through" in the output generation step(s), and in some cases in the thinking too.
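
To make the idea of per-layer mixing concrete, here is a minimal toy sketch. It is not the actual merge recipe for this model: the linear interpolation, the function name `blend_layers`, the four-layer toy models, and the `reasoning_share` values are all assumptions chosen for illustration. It only shows how the final layers can be a blend of both models while other layers come from one model alone.

```python
def blend_layers(core, reasoning, reasoning_share):
    """Linearly interpolate two models' weights, layer by layer.

    core, reasoning: lists of layers, each layer a list of floats (toy weights).
    reasoning_share: per-layer fraction taken from the reasoning model
                     (hypothetical values, not the real recipe).
    """
    merged = []
    for core_w, reason_w, share in zip(core, reasoning, reasoning_share):
        merged.append(
            [(1.0 - share) * c + share * r for c, r in zip(core_w, reason_w)]
        )
    return merged


# Toy 4-layer "models" with 2 weights per layer.
core = [[1.0, 1.0] for _ in range(4)]        # stands in for the core model
reasoning = [[0.0, 0.0] for _ in range(4)]   # stands in for the reasoning model

# Assumption: early layers taken fully from the reasoning model,
# final two layers mixed 50/50 so the core model also shapes the output.
reasoning_share = [1.0, 1.0, 0.5, 0.5]

merged = blend_layers(core, reasoning, reasoning_share)
print(merged)
```

In a blend like this, the merged layers are a single set of weights, so under these assumptions no part of the generated text can be attributed to one source model alone.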
