---
license: mit
language:
  - en
  - ja
base_model:
  - nvidia/Llama-3.1-Nemotron-70B-Instruct
pipeline_tag: text-generation
extra_gated_fields:
  Affiliation: text
  Country: country
  I want to use this model for:
    type: select
    options:
      - Research
      - Commercial
      - label: Other
        value: other
  Job title:
    type: select
    options:
      - Student
      - Research graduate
      - AI researcher
      - AI developer/engineer
      - Cybersecurity researcher
      - Reporter
      - Other
  geo: ip_location
library_name: transformers
datasets:
  - trendmicro-ailab/Primus-FineWeb
tags:
  - cybersecurity
---

# Llama-Primus-Nemotron-70B-Base

- [Introduction](#introduction)
- [Benchmark Results](#benchmark-results)
- [Training Datasets](#training-datasets)
- [Acknowledgments](#acknowledgments)

## Introduction

The **Llama-Primus-Nemotron** series builds upon `nvidia/Llama-3.1-Nemotron-70B-Instruct` through continued training. Following the methodology described in the [Primus paper](https://arxiv.org/abs/2502.11191), we first performed continued pre-training on large-scale cybersecurity corpora (over **10B** tokens) to obtain **Llama-Primus-Nemotron-Base**. We then conducted supervised fine-tuning and applied [DELLA](https://arxiv.org/abs/2406.11617) merging with the original Nemotron, resulting in **Llama-Primus-Nemotron-70B-Instruct**.

_Llama-Primus-Nemotron-Base_ achieves an **11.19%** improvement in aggregate score across several public cybersecurity benchmarks.

## Benchmark Results

### Cybersecurity

| **Metric** (5-shot, **w/o** chat template) | **Llama-3.1-Nemotron-70B-Instruct** | **Llama-Primus-Nemotron-70B-Base** |
|--------------------------------------------|-------------------------------------|------------------------------------|
| **CTI-Bench (MCQ)**                        | 0.6900                              | 0.7148                             |
| **CTI-Bench (CVE → CWE)**                  | 0.6590                              | 0.7410                             |
| **CTI-Bench (CVSS, _lower is better_)**    | 1.1893                              | 1.0281                             |
| **CTI-Bench (ATE)**                        | 0.3905                              | 0.4540                             |
| **CyberMetric (500)**                      | 0.9380                              | 0.9280                             |
| **SecEval**                                | 0.7177                              | 0.7208                             |
| **CISSP (Exam Questions)**                 | 0.8527                              | 0.8703                             |
| **_Aggregate_**                            | 3.0586                              | 3.4008 **↑11.19%** 🔥              |

CTI-Bench (CVSS) is scored using Mean Absolute Deviation (_lower is better_), CTI-Bench (ATE) uses F1 score, and the others use accuracy. The aggregate score (_Agg._) is the sum of all benchmark scores, with CTI-Bench (CVSS) negated; a worked recomputation is shown after the Training Datasets section below.

References:

- **CyberMetric**: [CyberMetric: A Benchmark Dataset based on Retrieval-Augmented...](https://arxiv.org/abs/2402.07688)
- **CTI-Bench**: [CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence](https://arxiv.org/abs/2406.07599)
- **SecEval**: [SecEval: A Comprehensive Benchmark for Evaluating Cybersecurity Knowledge of Foundation Models](https://xuanwuai.github.io/SecEval/)

## Training Datasets

### Pre-training

- **Primus-Seed-V2 (0.457B tokens):** An enhanced version of [Primus-Seed](https://huggingface.co/datasets/trendmicro-ailab/Primus-Seed), enriched with blogs, news, books, websites, Wikipedia, MITRE, and Trend Micro knowledge.
- **Primus-FineWeb (2.57B tokens):** Cybersecurity text filtered from FineWeb-edu-score-2. [Link](https://huggingface.co/datasets/trendmicro-ailab/Primus-FineWeb)
- **Primus-Nemotron-CC (7.6B tokens):** Cybersecurity text filtered from Nemotron-CC.

> **Note:** The *Primus-Seed-V2* and *Primus-Nemotron-CC* datasets are not yet open-sourced; their release is currently under discussion. Feel free to reach out if you are interested.

> **Disclaimer:** No Trend Micro customer information is included.
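As noted in the benchmark section, the aggregate score is the sum of all benchmark scores with the CTI-Bench (CVSS) MAD term negated. The following minimal sketch (plain Python, values copied from the table above) recomputes both aggregates and the relative improvement:

```python
# Recompute the aggregate benchmark score from the table above.
# CTI-Bench (CVSS) is Mean Absolute Deviation (lower is better), so it is negated.
scores = {
    "nemotron": {"mcq": 0.6900, "cve2cwe": 0.6590, "cvss_mad": 1.1893,
                 "ate": 0.3905, "cybermetric": 0.9380, "seceval": 0.7177,
                 "cissp": 0.8527},
    "primus":   {"mcq": 0.7148, "cve2cwe": 0.7410, "cvss_mad": 1.0281,
                 "ate": 0.4540, "cybermetric": 0.9280, "seceval": 0.7208,
                 "cissp": 0.8703},
}

def aggregate(s: dict) -> float:
    # Sum of all benchmarks, with the CVSS MAD term negated.
    return sum(v for k, v in s.items() if k != "cvss_mad") - s["cvss_mad"]

base = aggregate(scores["nemotron"])   # 3.0586
ours = aggregate(scores["primus"])     # 3.4008
print(f"{base:.4f} -> {ours:.4f} ({(ours - base) / base:+.2%})")  # +11.19%
```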
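The card's metadata declares `library_name: transformers` and `pipeline_tag: text-generation`, so loading should follow the standard `transformers` causal-LM flow. The snippet below is a minimal sketch rather than an official example: the repository ID is assumed from this card's title, and in practice a 70B model requires multiple GPUs or quantization.

```python
# Minimal sketch, assuming the standard transformers text-generation flow.
# The repo ID below is inferred from this card's title and may differ.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "trendmicro-ailab/Llama-Primus-Nemotron-70B-Base"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 70B weights need multiple GPUs or quantization
    device_map="auto",
)

# The benchmarks above were run 5-shot *without* a chat template, so plain
# completion-style prompting is shown here.
prompt = "CVE-2021-44228 affects which software component?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```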
## About _Primus_

_Primus_ is Trend Micro's pioneering family of lightweight, state-of-the-art open cybersecurity language models and datasets. Developed through our research initiatives and advanced engineering, these resources share the foundation that powers our enterprise-class [Trend Cybertron](https://newsroom.trendmicro.com/2025-02-25-Trend-Micro-Puts-Industry-Ahead-of-Cyberattacks-with-Industrys-First-Proactive-Cybersecurity-AI) solution. As an industry leader in cybersecurity, Trend Micro is proud to contribute these efficiency-optimized models and datasets to the community, while maintaining the excellence and reliability that define our global security standards.

## Acknowledgments

We would like to thank **NVIDIA** for generously providing computing resources (**Taipei-1**), which enabled the training and development of this model.

## License

This model is released under the MIT license; however, because it is built on Llama 3.1, you must also comply with the Llama 3.1 Community License Agreement.