Ubuntu committed on
Commit 96d87d0 · 1 Parent(s): 1a44423

add deepseek-r1&o1-preview version

README.md CHANGED
@@ -47,6 +47,9 @@ extra_gated_fields:
 
 **🔥 For more details, please refer to the paper: [[📄Paper]](https://arxiv.org/abs/2502.11191).**
 
+📢 News (2025/06/02): We have expanded the [Primus-Reasoning](https://huggingface.co/datasets/trendmicro-ailab/Primus-Reasoning) dataset with additional samples from DeepSeek-R1. Accordingly, we have replaced Llama-Primus-Reasoning with a new version distilled jointly from DeepSeek-R1 and o1-preview. This version achieves the best CISSP performance, with a 15.8% improvement.
+
+
 ## Introduction
 
 Large Language Models (LLMs) have demonstrated remarkable versatility in recent years, with promising applications in specialized domains such as finance, law, and biomedicine. However, in the domain of cybersecurity, we noticed a lack of open-source datasets specifically designed for LLM pre-training—even though much research has shown that LLMs acquire their knowledge during pre-training. To fill this gap, we present a collection of datasets covering multiple stages of cybersecurity LLM training, including pre-training (_Primus-Seed_ and _Primus-FineWeb_), instruction fine-tuning (_Primus-Instruct_), and reasoning data for distillation (_Primus-Reasoning_). Based on these datasets and Llama-3.1-8B-Instruct, we developed _Llama-Primus-Base_, _Llama-Primus-Merged_, and _Llama-Primus-Reasoning_. This model card is **Llama-Primus-Reasoning**.
@@ -55,22 +58,24 @@ Large Language Models (LLMs) have demonstrated remarkable versatility in recent
 
 ## Cybersecurity Benchmark Results
 
-
-| Model                                | CISSP             | Avg. Tokens |
-|--------------------------------------|-------------------|-------------|
-| **w/o CoT, 5-shot**                  |                   |             |
-| Llama-3.1-8B-Instruct                | 0.7073            | 1           |
-| Llama-Primus-Merged                  | 0.7191 ↑1.67%     | 1           |
-| **w/ CoT, 0-shot**                   |                   |             |
-| Llama-3.1-8B-Instruct                | 0.7288 ↑3.03%     | 279.69      |
-| DeepSeek-R1-Distill-Llama-8B         | 0.7399 ↑4.61%     | 1542.10     |
-| Llama-Primus-Merged                  | 0.7603 ↑7.49%     | 241.92      |
-| **Finetuned on Primus-Reasoning**    |                   |             |
-| Llama-3.1-8B-Reasoning               | 0.7583 ↑7.21%     | 646.94      |
-| Llama-Primus-Reasoning               | 0.7780 ↑**10.0%** | 726.96      |
-| ---                                  |                   |             |
-| o1-preview                           | 0.8035            | 1054.91     |
+| Model                                  | CISSP             | Avg. Tokens |
+|----------------------------------------|-------------------|-------------|
+| **w/o CoT, 5-shot**                    |                   |             |
+| Llama-3.1-8B-Instruct                  | 0.7073            | 1           |
+| Llama-Primus-Merged                    | 0.7191 ↑1.67%     | 1           |
+| **w/ CoT, 0-shot**                     |                   |             |
+| Llama-3.1-8B-Instruct                  | 0.7288 ↑3.03%     | 279.69      |
+| └─ + *Distilled from o1-preview*       | 0.7583 ↑7.21%     | 646.94      |
+| └─ + *Distilled from DeepSeek-R1*      | 0.7859 ↑11.1%     | 1667.56     |
+| └─ + *Distilled from (o1 + R1)*        | 0.7780 ↑10.0%     | 1615.54     |
+| Llama-Primus-Merged                    | 0.7603 ↑7.49%     | 241.92      |
+| └─ + *Distilled from o1-preview*       | 0.7780 ↑10.0%     | 726.96      |
+| └─ + *Distilled from DeepSeek-R1*      | 0.8075 ↑14.2%     | 1483.94     |
+| └─ + *Distilled from (o1 + R1)*        | 0.8193 ↑**15.8%** | 1467.40     |
+| **Raw Models for Comparison**          |                   |             |
+| o1-preview                             | 0.8035            | 1054.91     |
+| DeepSeek-R1                            | 0.8212            | 1229.32     |
+| DeepSeek-R1-Distill-Llama-8B           | 0.7399 ↑4.61%     | 1542.10     |
 
 
 Effect of _Primus-Reasoning_ fine-tuning, evaluated on CISSP. ↑ indicates the percentage improvement over Llama without CoT and in the 5-shot setting. The best improvement is highlighted in **bold**.
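The ↑ figures in the table above are relative gains over the 5-shot, no-CoT Llama-3.1-8B-Instruct score (0.7073). A minimal sketch of that arithmetic, using scores taken from the table:

```python
# Relative improvement over the 5-shot, no-CoT Llama-3.1-8B-Instruct baseline.
BASELINE = 0.7073  # CISSP accuracy, w/o CoT, 5-shot

def improvement(score: float) -> float:
    """Percentage gain of `score` over the baseline."""
    return (score - BASELINE) / BASELINE * 100

# Scores from the benchmark table.
print(f"{improvement(0.7780):.1f}%")  # Llama-Primus-Merged + o1-preview distill -> 10.0%
print(f"{improvement(0.8193):.1f}%")  # Llama-Primus-Merged + (o1 + R1) distill -> 15.8%
```

This reproduces the 15.8% best-improvement figure quoted in the news entry.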
config.json CHANGED
@@ -1,5 +1,4 @@
 {
-  "_name_or_path": "/checkpoints/Primus-Christmas_264_0.75_0.25",
   "architectures": [
     "LlamaForCausalLM"
   ],
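The config change above removes a leftover local checkpoint path (`_name_or_path`) before release. A minimal sketch of scanning a config for such string values, assuming standard JSON and the hypothetical helper name `local_path_keys`:

```python
import json

# Flag config keys whose string values look like local filesystem paths,
# e.g. the removed "_name_or_path": "/checkpoints/..." entry.
def local_path_keys(config: dict) -> list:
    return [k for k, v in config.items()
            if isinstance(v, str) and v.startswith("/")]

cfg = json.loads(
    '{"_name_or_path": "/checkpoints/x", "architectures": ["LlamaForCausalLM"]}'
)
print(local_path_keys(cfg))  # ['_name_or_path']
```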
model-00001-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6100a4ecee1314185a2a46923c808091569610a2fe1ecff9fe24a1919671a248
+oid sha256:9b0afd7cf1ce9c90b182af76b0f4788cf14417a41983a0d367d7125063bda061
 size 4976698672
model-00002-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:62b92e2b8568bc7fd674e22e8843b50d0ca13cef76464486f990aabc2828e273
+oid sha256:449710301f4779dc387c2897bd811bbd79cf8f9967086e6ee985572bf9dc19dd
 size 4999802720
model-00003-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f5d105373bd15743a1e6e8c62bbb9588dde412316dfbe4faa79c884eb7612c3e
+oid sha256:5a2ffbd5a57e5bcc87e52abbf93c2bb892a2dad501f60bf8d5005bc1304ae475
 size 4915916176
model-00004-of-00004.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0a99b862de52bd2f68cd421121f98f43c5c818c93e672a8effc42b45b5705bc1
+oid sha256:29a292c724a079471f6ba960b935d6c577314f9345fb5cb7f8f105d835ab9ac5
 size 1168138808
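The `.safetensors` entries in this commit are Git LFS pointer files: three lines giving the spec version, the object's SHA-256 digest, and its size in bytes. A minimal sketch of parsing such a pointer, assuming the three-line format shown above (the function name is illustrative):

```python
# Parse a Git LFS pointer file of the form seen in the diff:
#   version https://git-lfs.github.com/spec/v1
#   oid sha256:<64 hex chars>
#   size <bytes>
def parse_lfs_pointer(text: str) -> dict:
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "algo": algo,
            "oid": digest, "size": int(fields["size"])}

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:9b0afd7cf1ce9c90b182af76b0f4788cf14417a41983a0d367d7125063bda061
size 4976698672
"""
info = parse_lfs_pointer(pointer)
print(info["algo"], info["size"])  # sha256 4976698672
```

Note that the commit changes only the `oid` digests: the shard sizes are unchanged, so the weights were replaced with tensors of identical shapes.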