Update README.md
README.md
CHANGED
@@ -38,7 +38,7 @@ extra_gated_fields:
 
 <img src="https://i.imgur.com/PtqeTZw.png" alt="Primus Overview" width="60%">
 
-> TL;DR: Llama-Primus-Base is a foundation model based on Llama-3.1-8B-Instruct, continually pre-trained on Primus-Seed (0.2B) and Primus-FineWeb (2.57B). Primus-Seed is a high-quality, manually curated cybersecurity text dataset, while Primus-FineWeb consists of cybersecurity texts filtered from FineWeb. By pretraining on such a large-scale cybersecurity corpus, it achieves a 🚀**15.88%** improvement in aggregated scores across multiple cybersecurity benchmarks, demonstrating the effectiveness of cybersecurity-specific pretraining.
+> TL;DR: Llama-Primus-Base is a foundation model based on Llama-3.1-8B-Instruct, continually pre-trained on Primus-Seed (0.2B) and Primus-FineWeb (2.57B). Primus-Seed is a high-quality, manually curated cybersecurity text dataset, while Primus-FineWeb consists of cybersecurity texts filtered from FineWeb, a refined version of Common Crawl. By pretraining on such a large-scale cybersecurity corpus, it achieves a 🚀**15.88%** improvement in aggregated scores across multiple cybersecurity benchmarks, demonstrating the effectiveness of cybersecurity-specific pretraining.
 
 **🔥 For more details, please refer to the paper: [[📄Paper]](https://arxiv.org/abs/2502.11191).**
 