Hi, I'm using this model via vLLM to serve it on my own GPU.
I was wondering if you plan to implement a reasoning and JSON parser in vLLM, so that only the model's final answer is processed and we don't have to manually discard the think text, etc.
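For now I strip the reasoning manually before parsing the JSON; a minimal sketch of what that looks like, assuming the model wraps its reasoning in `<think>...</think>` tags (the tag name may differ for your model):

```python
import json
import re


def strip_think(text: str) -> str:
    """Remove any <think>...</think> reasoning block from the model output.

    Assumes the model emits reasoning wrapped in <think> tags; adjust the
    pattern if your model uses a different delimiter.
    """
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()


def parse_answer(raw_output: str) -> dict:
    # Discard the reasoning text, then parse the remaining JSON answer.
    return json.loads(strip_think(raw_output))


# Hypothetical example output from the model:
raw = '<think>Let me reason step by step...</think>{"answer": 42}'
print(parse_answer(raw))
```

Having this handled by vLLM itself (parsing the completion server-side and returning reasoning and answer as separate fields) would remove the need for this kind of client-side post-processing.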
Thanks!