Hi! I've read the original PI paper. It says they only fine-tune for about 1000 steps to extend the context window. Did you fine-tune for the same number of steps (i.e. 1000) as the original paper? Thanks!