Budget forcing?
#50
by
mwettach
- opened
I see a lot of "wait..." in the output. Even after the correct solution to a relatively simple problem has been presented and checked twice, the output still goes on with "wait..." and further considerations. Possibly the authors have applied budget forcing principles from the Standford s1 model (https://github.com/simplescaling/s1, https://arxiv.org/abs/2501.19393), but did not yet find the ideal spot when to refrain from further "wait..." tokens and end the answer.