Updated model can't stop generation

#47
by JackBAI - opened

I fine-tuned the 27b model on a single prompt for a while and found that, after overfitting, the generation exactly matches the training text — but after reproducing the ground truth, the model does not stop generating. Has anyone else seen this? I am using Q-LoRA with DeepSpeed.

Google org

Hi @JackBAI ,

Overfitting just means the model has memorized the content; it doesn't necessarily learn when to stop generating. To fix this, make sure to set the following:

eos_token_id (the token that tells the model where to stop), and

max_new_tokens (to limit the length of the generated output).

These settings help control the generation and prevent the model from continuing unnecessarily.
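The interplay of the two settings can be sketched with a toy decoding loop (pure Python with illustrative names, not the actual transformers API): generation appends tokens one at a time, stops early when `eos_token_id` is produced, and is otherwise capped by `max_new_tokens`. If the fine-tuning examples never ended with the EOS token, the model never learns to emit it, and only the `max_new_tokens` cap applies.

```python
EOS_TOKEN_ID = 2  # assumed id of the end-of-sequence token

def fake_next_token(step):
    """Stub standing in for the model: replays a fixed, memorized
    continuation and then emits EOS (mimicking an overfit model)."""
    continuation = [5, 6, 7, EOS_TOKEN_ID]
    return continuation[step % len(continuation)]

def generate(prompt, eos_token_id=None, max_new_tokens=16):
    """Greedy decoding loop with the two stopping criteria."""
    out = list(prompt)
    for step in range(max_new_tokens):  # hard cap on generated length
        nxt = fake_next_token(step)
        out.append(nxt)
        if eos_token_id is not None and nxt == eos_token_id:
            break  # stop as soon as EOS is produced
    return out

# With eos_token_id set, decoding halts right after the memorized text:
print(generate([1], eos_token_id=EOS_TOKEN_ID))   # [1, 5, 6, 7, 2]
# Without it, decoding runs on until max_new_tokens is exhausted:
print(len(generate([1])) - 1)                     # 16
```

In the real transformers `generate()` call, these correspond to the `eos_token_id` and `max_new_tokens` arguments (or the equivalent fields of a `GenerationConfig`).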

Thank you.
