Update README.md

language:
- en
pipeline_tag: text-generation
---

## SmallThinker-4BA0.6B-Instruct-GGUF

- GGUF models with the `.gguf` suffix can be used with the [*llama.cpp*](https://github.com/ggml-org/llama.cpp) framework.
- GGUF models with the `.powerinfer.gguf` suffix are integrated with fused sparse FFN operators and sparse LM head operators. These models are only compatible with the [*PowerInfer*](https://github.com/SJTU-IPADS/PowerInfer/tree/main/smallthinker) framework.

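For the plain `.gguf` files, a minimal llama.cpp invocation looks like the sketch below. The file name and download command are illustrative assumptions; check this repository's file list for the actual names.

```shell
# Fetch a GGUF file from this repository (file name below is illustrative):
huggingface-cli download PowerInfer/SmallThinker-4BA0.6B-Instruct-GGUF \
  SmallThinker-4BA0.6B-Instruct.gguf --local-dir .

# Chat interactively with llama.cpp's CLI:
./llama-cli -m SmallThinker-4BA0.6B-Instruct.gguf \
  -p "Give me a short introduction to large language models." \
  -n 256
```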
## Introduction

<p align="center">
  🤗 <a href="https://huggingface.co/PowerInfer">Hugging Face</a> | 🤖 <a href="https://modelscope.cn/organization/PowerInfer">ModelScope</a> | 📑 <a href="https://github.com/SJTU-IPADS/SmallThinker/blob/main/smallthinker-technical-report.pdf">Technical Report</a>
</p>

SmallThinker is a family of **on-device native** Mixture-of-Experts (MoE) language models specially designed for local deployment, co-developed by the **IPADS and School of AI at Shanghai Jiao Tong University** and **Zenergize AI**. Designed from the ground up for resource-constrained environments, SmallThinker brings powerful, private, and low-latency AI directly to your personal devices without relying on the cloud.

## Performance

Note: The model is trained mainly on English.

| Model | MMLU | GPQA-diamond | GSM8K | MATH-500 | IFEVAL | LIVEBENCH | HUMANEVAL | Average |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| **SmallThinker-4BA0.6B-Instruct** | **66.11** | **31.31** | 80.02 | <u>60.60</u> | 69.69 | **42.20** | **82.32** | **61.75** |

You can deploy SmallThinker with offloading support using [PowerInfer](https://github.com/SJTU-IPADS/PowerInfer/tree/main/smallthinker).

### Transformers

`transformers==4.53.3` is required; we are actively working to support the latest version.
The following code snippet illustrates how to use the model to generate content from given inputs (the model ID below assumes the PowerInfer organization on Hugging Face; adjust it to the actual repository).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model ID is an assumption based on this collection's naming; adjust if needed.
model_name = "PowerInfer/SmallThinker-4BA0.6B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

messages = [{"role": "user", "content": "Give me a short introduction to large language models."}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(**model_inputs, max_new_tokens=512)
# Decode only the newly generated tokens, not the prompt.
output_ids = generated_ids[0][model_inputs.input_ids.shape[-1]:]
print(tokenizer.decode(output_ids, skip_special_tokens=True))
```