model context width is 2560 not 2048

#73

by amitport - opened Jan 9, 2024

Jan 9, 2024

When loading the model, the weights have a dimension of 2560, not 2048, regardless of the input length the model was trained on.

If the model was not trained on samples longer than 2048 tokens, this is just a waste of memory; if it was trained on samples of length 2560, the docs need to be updated.
Which one is it?

Thank you

amitport

Jan 9, 2024

Sorry, I mixed up n_positions with n_embd: the 2560 is the embedding size (n_embd), not the context length (n_positions).
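For anyone who hits the same confusion, here is a minimal sketch of how the two values differ. It assumes GPT-2-style config naming, where n_positions is the maximum sequence length and n_embd is the hidden size. ToyConfig, activation_shape, and square_weight_shape are hypothetical names made up for illustration; they are not part of this model's code or of any library.

```python
from dataclasses import dataclass


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a GPT-2-style config (illustration only)."""
    n_positions: int  # maximum sequence length (context width), in tokens
    n_embd: int       # hidden size: width of each token's embedding vector


def activation_shape(cfg: ToyConfig, seq_len: int) -> tuple:
    """Shape of the hidden states for one sequence: (tokens, hidden size)."""
    if seq_len > cfg.n_positions:
        raise ValueError(f"seq_len {seq_len} exceeds n_positions {cfg.n_positions}")
    return (seq_len, cfg.n_embd)


def square_weight_shape(cfg: ToyConfig) -> tuple:
    """Shape of a hidden-to-hidden projection; it depends only on n_embd."""
    return (cfg.n_embd, cfg.n_embd)


# Values taken from this thread: context length 2048, embedding size 2560.
cfg = ToyConfig(n_positions=2048, n_embd=2560)
print(activation_shape(cfg, 2048))
print(square_weight_shape(cfg))
```

In this sketch the weight shape comes out as 2560 no matter how long the input is, which is why seeing 2560 in the loaded weights says nothing about the context length.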

amitport changed discussion status to closed Jan 9, 2024
