Could you please share more details about fine-tuning LLaMA-2-7B into LLaMA-2-7B-32K, such as the number of fine-tuning steps and the batch size? Thanks!

#32 opened by Mooler

Hi! I've read the original Position Interpolation (PI) paper. It says they fine-tune for only about 1,000 steps to extend the context window. Did you fine-tune for the same number of steps (i.e., 1,000) as in the original paper? Thanks!
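For readers unfamiliar with the PI technique mentioned above: the idea is to rescale RoPE position indices by the ratio of the original to the extended context length (4096/32768 for LLaMA-2 extended to 32K), so long positions fall back inside the range seen during pretraining, and then fine-tune briefly. The sketch below only illustrates that rescaling; the function names, shapes, and defaults are assumptions for illustration, not the actual LLaMA-2-7B-32K training code.

```python
# Minimal sketch of RoPE Position Interpolation (PI), assuming a 4K -> 32K
# context extension. Illustrative only, not the model's training code.
import torch

def rope_frequencies(head_dim: int, base: float = 10000.0) -> torch.Tensor:
    """Standard RoPE inverse frequencies for a given head dimension."""
    return 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))

def rope_angles(seq_len: int, head_dim: int,
                original_ctx: int = 4096, extended_ctx: int = 32768) -> torch.Tensor:
    """
    Rotation angles for positions 0..seq_len-1.

    Position Interpolation rescales each position index by
    original_ctx / extended_ctx (here 4096/32768 = 1/8), squeezing positions
    up to 32K back into the range the model saw during pretraining.
    """
    scale = original_ctx / extended_ctx          # e.g. 0.125 for 4K -> 32K
    positions = torch.arange(seq_len).float() * scale
    inv_freq = rope_frequencies(head_dim)
    return torch.outer(positions, inv_freq)      # (seq_len, head_dim // 2)

# Example: with interpolation, position 32767 maps to ~4095.9, i.e. inside
# the original 4K training range instead of far outside it.
angles = rope_angles(seq_len=32768, head_dim=128)
print(angles.shape)  # torch.Size([32768, 64])
```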
