Sparse Attention for Ultra-Long Texts
#8
by
cizhenshi
- opened
Thank you for your open-source contribution! I am trying to deploy a model with your vLLM, but the code in the corresponding branch does not seem to include the sparse attention implementation. Will this part not be open-sourced? If it is available, where can I find it?
I have since found the corresponding implementation. Thank you for your work! In practice, it does provide a significant speedup!
cizhenshi
changed discussion status to
closed