Sparse Attention for Ultra-Long Texts

#8
by cizhenshi - opened

Thank you for your open-source contribution! I am trying to deploy a model with your vLLM branch, but the branch code does not seem to include the sparse attention implementation. Will this part be open-sourced? If it already is, where can I find it?

I have since found the corresponding implementation. Thank you for your work! In practice, it does provide a significant speedup!

cizhenshi changed discussion status to closed
