How did you go about training this model? Did you encounter problems with the tokenizer during training?
#1, opened by radna
@valoomba I looked at your tokenizer config, and the tokenizer seems to have changed compared to the original FuseO1 model. My loss drops to 0 during training, and the output is just gibberish. Could the tokenizer settings be the cause?
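For anyone trying to narrow this down, here is a rough sketch of how the two tokenizer configs could be compared key by key. It assumes both `tokenizer_config.json` files have been downloaded locally. The helper names (`load_config`, `diff_configs`) and the paths in the usage comment are hypothetical, and a difference in the config does not by itself prove that the tokenizer is what drives the loss to 0.

```python
import json
from pathlib import Path

# Placeholder used when a key exists in only one of the two configs.
MISSING = "<missing>"


def load_config(path):
    """Read a tokenizer_config.json file into a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def diff_configs(base, tuned):
    """Return {key: (base_value, tuned_value)} for each top-level key that differs."""
    diffs = {}
    for key in sorted(set(base) | set(tuned)):
        base_value = base.get(key, MISSING)
        tuned_value = tuned.get(key, MISSING)
        if base_value != tuned_value:
            diffs[key] = (base_value, tuned_value)
    return diffs


# Hypothetical usage, with paths that are assumptions:
# base = load_config("fuseo1/tokenizer_config.json")
# tuned = load_config("finetuned/tokenizer_config.json")
# for key, (old, new) in diff_configs(base, tuned).items():
#     print(f"{key}: {old!r} -> {new!r}")
```

If keys such as `eos_token`, `pad_token`, or `chat_template` show up in the diff, those would be the first settings to check against the way the training data was formatted.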