---
license: mit
language:
- en
- ja
base_model:
- nvidia/Llama-3.1-Nemotron-70B-Instruct
pipeline_tag: text-generation
extra_gated_fields:
  Affiliation: text
  Country: country
  I want to use this model for:
    type: select
    options:
      - Research
      - Commercial
      - label: Other
        value: other
  Job title:
    type: select
    options:
      - Student
      - Research graduate
      - AI researcher
      - AI developer/engineer
      - Cybersecurity researcher
      - Reporter
      - Other
  geo: ip_location
library_name: transformers
datasets:
- trendmicro-ailab/Primus-FineWeb
tags:
- cybersecurity
---
# Llama-Primus-Nemotron-70B-Base
<img src="https://i.imgur.com/yzitCm9.jpeg" alt="Llama-Primus-Nemotron" width="60%">
- [Introduction](#introduction)
- [Benchmark Result](#benchmark-result)
- [Training Datasets](#training-datasets)
- [About _Primus_](#about-primus)
- [Acknowledgments](#acknowledgments)
- [License](#license)
## Introduction
The **Llama-Primus-Nemotron** series builds on `nvidia/Llama-3.1-Nemotron-70B-Instruct` through continued training. Following the methodology of the [Primus paper](https://arxiv.org/abs/2502.11191), we first continued pre-training on large-scale cybersecurity corpora (over **10B** tokens) to obtain **Llama-Primus-Nemotron-70B-Base** (this model). We then performed supervised fine-tuning and applied [DELLA](https://arxiv.org/abs/2406.11617) merging with the original Nemotron model, resulting in **Llama-Primus-Nemotron-70B-Instruct**.
_Llama-Primus-Nemotron-70B-Base_ achieves an **11.19%** improvement in the aggregate score across several public cybersecurity benchmarks.
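For reference, here is a minimal inference sketch with Hugging Face `transformers`. The repository id is an assumption based on this card's title (check the page URL), and a 70B model in bfloat16 requires several high-memory GPUs or quantization:

```python
# Minimal inference sketch; the repo id below is an assumption from this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "trendmicro-ailab/Llama-Primus-Nemotron-70B-Base"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 70B weights: plan for multiple GPUs
    device_map="auto",
)

# This is a base (non-chat) checkpoint, so prompt it as plain completion,
# matching the 5-shot, no-chat-template evaluation setup below.
prompt = "CVE-2021-44228 is a remote code execution vulnerability in"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```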
## Benchmark Result
### Cybersecurity
| **Metric** (5-shot, **w/o** chat template) | **Llama-3.1-Nemotron-70B-Instruct** | **Llama-Primus-Nemotron-70B-Base** |
|-------------------------------------------|-------------------------------------|----------------------------------------|
| **CTI-Bench (MCQ)** | 0.6900 | 0.7148 |
| **CTI-Bench (CVE → CWE)**                 | 0.6590                              | 0.7410                                  |
| **CTI-Bench (CVSS, _lower is better_)** | 1.1893 | 1.0281 |
| **CTI-Bench (ATE)** | 0.3905 | 0.4540 |
| **CyberMetric (500)** | 0.9380 | 0.9280 |
| **SecEval** | 0.7177 | 0.7208 |
| **CISSP (Exam Questions)** | 0.8527 | 0.8703 |
| **_Aggregate_**                           | 3.0586                              | 3.4008 **↑11.19%** 🔥                   |
CTI-Bench (CVSS) is scored with Mean Absolute Deviation (_lower is better_), CTI-Bench (ATE) with F1, and the remaining benchmarks with accuracy. The aggregate score (_Aggregate_) is the sum of all benchmark scores, with CTI-Bench (CVSS) negated.
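To make the aggregation explicit, here is a short Python sketch that reproduces the _Aggregate_ row from the table above:

```python
# Recompute the aggregate row: sum all benchmark scores per model,
# negating CTI-Bench (CVSS) because its MAD metric is lower-is-better.
scores = {  # (Nemotron-70B-Instruct, Primus-Nemotron-70B-Base)
    "CTI-Bench (MCQ)":       (0.6900, 0.7148),
    "CTI-Bench (CVE → CWE)": (0.6590, 0.7410),
    "CTI-Bench (CVSS)":      (1.1893, 1.0281),  # MAD, negated below
    "CTI-Bench (ATE)":       (0.3905, 0.4540),  # F1
    "CyberMetric (500)":     (0.9380, 0.9280),
    "SecEval":               (0.7177, 0.7208),
    "CISSP":                 (0.8527, 0.8703),
}

def aggregate(col: int) -> float:
    return sum(-v[col] if "CVSS" in name else v[col] for name, v in scores.items())

base, primus = aggregate(0), aggregate(1)
print(f"{base:.4f} -> {primus:.4f} (+{(primus - base) / base:.2%})")
# 3.0586 -> 3.4008 (+11.19%)
```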
References:
- **CyberMetric**: [CyberMetric: A Benchmark Dataset based on Retrieval-Augmented Generation for Evaluating LLMs in Cybersecurity Knowledge](https://arxiv.org/abs/2402.07688)
- **CTI-Bench**: [CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence](https://arxiv.org/abs/2406.07599)
- **SecEval**: [SecEval: A Comprehensive Benchmark for Evaluating Cybersecurity Knowledge of Foundation Models](https://xuanwuai.github.io/SecEval/)
## Training Datasets
### Pre-training
- **Primus-Seed-V2 (0.457B):** An enhanced version of [Primus-Seed](https://huggingface.co/datasets/trendmicro-ailab/Primus-Seed), enriched with blogs, news, books, websites, Wikipedia, MITRE and Trend Micro knowledge.
- **Primus-FineWeb (2.57B):** Cybersecurity text filtered from FineWeb-edu-score-2 ([link](https://huggingface.co/datasets/trendmicro-ailab/Primus-FineWeb)); see the loading sketch after the notes below.
- **Primus-Nemotron-CC (7.6B):** Cybersecurity text filtered from Nemotron-CC.
> **Note:** *Primus-Seed-V2* and *Primus-Nemotron-CC* are not yet open-sourced; their release is still under discussion. Feel free to reach out if you're interested.
> **Disclaimer:** No Trend Micro customer information is included.
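Of the three corpora, only Primus-FineWeb is public so far. Below is a minimal sketch for streaming it with the `datasets` library; the split name and record schema are assumptions, so consult the dataset card:

```python
# Stream Primus-FineWeb without downloading the full 2.57B-token corpus.
# The "train" split and record fields are assumptions; see the dataset card.
from datasets import load_dataset

ds = load_dataset("trendmicro-ailab/Primus-FineWeb", split="train", streaming=True)
first = next(iter(ds))
print(first.keys())  # inspect the available fields on the first record
```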
## About _Primus_
_Primus_ is Trend Micro's pioneering family of lightweight, state-of-the-art open cybersecurity language models and datasets. Developed through our cutting-edge research initiatives and advanced technology, these resources share the innovative foundation that powers our enterprise-class [Trend Cybertron](https://newsroom.trendmicro.com/2025-02-25-Trend-Micro-Puts-Industry-Ahead-of-Cyberattacks-with-Industrys-First-Proactive-Cybersecurity-AI) solution. As an industry leader in cybersecurity, Trend Micro is proud to contribute these powerful, efficiency-optimized models and datasets to the community, while maintaining the excellence and reliability that define our global security standards.
## Acknowledgments
We would like to thank **NVIDIA** for generously providing computing resources (**Taipei-1**), which enabled the training and development of this model.
## License
This model is released under the MIT license, but you must also comply with the Llama 3.1 Community License Agreement.