How did you go about training this model? Did you encounter problems with the tokenizer during training?

#1
by radna - opened

@valoomba I saw your tokenizer config, and it seems the tokenizer has changed compared to the original FuseO1 model. My loss drops to 0 during training, and the output is just gibberish. Could the tokenizer settings be the cause?
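
For anyone wanting to check the same thing, here is a minimal sketch of one way to compare the two tokenizer configs. It assumes both `tokenizer_config.json` files have been downloaded locally; the helper names `load_config` and `diff_configs` are made up for illustration and are not part of any library.

```python
import json
from pathlib import Path

# Placeholder used when a key exists in only one of the two configs.
MISSING = "<missing>"


def load_config(path: str) -> dict:
    """Read a tokenizer_config.json file from a local path."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def diff_configs(original: dict, modified: dict) -> dict:
    """Return every top-level key whose value differs between the configs.

    Each entry maps the key to a (original_value, modified_value) pair.
    """
    diffs = {}
    for key in sorted(set(original) | set(modified)):
        before = original.get(key, MISSING)
        after = modified.get(key, MISSING)
        if before != after:
            diffs[key] = (before, after)
    return diffs
```

Calling `diff_configs(load_config("original/tokenizer_config.json"), load_config("finetuned/tokenizer_config.json"))` (both paths hypothetical) would list the differing entries. Differences in special tokens such as `eos_token` or `pad_token`, or in `chat_template`, are probably the first things worth looking at, though this alone does not prove the tokenizer is the cause.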
