Commit 97efda3
Parent(s): 2969437

update readme

README.md CHANGED
@@ -99,6 +99,14 @@ torchrun --nproc-per-node ${MP} generate.py --ckpt-path ${SAVE_PATH} --config ${
 
 
 
+## Open-Source Kernels
+
+For TileLang kernels with **better readability and research-purpose design**, please refer to [TileLang](https://github.com/tile-ai/tilelang/tree/main/examples/deepseek-v32).
+
+For **high-performance CUDA kernels**, indexer logit kernels (including paged versions) are available in [DeepGEMM](https://github.com/deepseek-ai/DeepGEMM/pull/200). Sparse attention kernels are released in [FlashMLA](https://github.com/deepseek-ai/FlashMLA/pull/98).
+
+
+
 ## License
 
 This repository and the model weights are licensed under the [MIT License](LICENSE).