GeeeekExplorer committed on
Commit 72cacec · 1 Parent(s): 600c271

add figure

Files changed (3)
  1. .gitattributes +1 -0
  2. README.md +2 -2
  3. assets/cost.png +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ assets/cost.png filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -2,7 +2,7 @@
  license: mit
  library_name: transformers
  base_model:
- - deepseek-ai/DeepSeek-V3.1-Base
+ - deepseek-ai/DeepSeek-V3.2-Exp-Base
  ---
  # DeepSeek-V3.2-Exp
 
@@ -50,7 +50,7 @@ We are excited to announce the official release of DeepSeek-V3.2-Exp, an experim
  This experimental release represents our ongoing research into more efficient transformer architectures, particularly focusing on improving computational efficiency when processing extended text sequences.
 
  <div align="center">
- <img src="cost.jpg" >
+ <img src="assets/cost.png" >
  </div>
 
  - DeepSeek Sparse Attention (DSA) achieves fine-grained sparse attention for the first time, delivering substantial improvements in long-context training and inference efficiency while maintaining virtually identical model output quality.
assets/cost.png ADDED

Git LFS Details

  • SHA256: d4b8e78d9a3220108e480bb44383b03a0d8ccbf7fc41ac113c5e67f5d7c8a44d
  • Pointer size: 131 Bytes
  • Size of remote file: 102 kB
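
Because assets/cost.png is tracked through Git LFS, the repository stores only a small text pointer and the image lives in remote storage. The sketch below shows one way a downloaded copy could be checked against the SHA256 listed above. The helper names (`build_pointer`, `parse_lfs_pointer`, `sha256_of`, `matches_pointer`) are hypothetical and not part of this commit or of any Git LFS tool. The exact byte size is not shown on this page (only "102 kB"), so any size passed in is the caller's assumption.

```python
import hashlib
from pathlib import Path

# SHA256 listed under "Git LFS Details" for assets/cost.png.
EXPECTED_OID = "d4b8e78d9a3220108e480bb44383b03a0d8ccbf7fc41ac113c5e67f5d7c8a44d"


def build_pointer(oid: str, size: int) -> str:
    """Render the three-line text of a Git LFS v1 pointer file."""
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {size}\n"
    )


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large LFS objects need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pointer(path: Path, pointer: dict) -> bool:
    """True if the file's size and SHA256 both agree with the pointer."""
    return (
        path.stat().st_size == pointer["size"]
        and sha256_of(path) == pointer["oid"]
    )
```

To compare only the hash of a local copy, `sha256_of(Path("assets/cost.png")) == EXPECTED_OID` is enough. A pointer built this way with a 64-character oid and a six-digit size comes to 131 bytes, which is consistent with the pointer size listed above.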