Sparse Attention for Ultra-Long Texts

by cizhenshi - opened 6 days ago

6 days ago

Thank you for your open-source contribution! I am trying to deploy a model with your vllm, but the corresponding branch does not seem to include the sparse attention code. Will this part not be open-sourced? If it will be, where can I find the implementation?

cizhenshi

5 days ago

I have found the corresponding implementation. Thank you for your work! In practice, it does deliver a significant speedup!

cizhenshi changed discussion status to closed 5 days ago
