---
license: mit
language:
- en
- ja
base_model:
- nvidia/Llama-3.1-Nemotron-70B-Instruct
pipeline_tag: text-generation
extra_gated_fields:
  Affiliation: text
  Country: country
  I want to use this model for:
    type: select
    options:
    - Research
    - Commercial
    - label: Other
      value: other
  Job title:
    type: select
    options:
    - Student
    - Research graduate
    - AI researcher
    - AI developer/engineer
    - Cybersecurity researcher
    - Reporter
    - Other
  geo: ip_location
library_name: transformers
datasets:
- trendmicro-ailab/Primus-FineWeb
tags:
- cybersecurity
---

# Llama-Primus-Nemotron-70B-Base

<img src="https://i.imgur.com/yzitCm9.jpeg" alt="Llama-Primus-Nemotron" width="60%">

- [Introduction](#introduction)
- [Benchmark Result](#benchmark-result)
- [Training Datasets](#training-datasets)
- [About _Primus_](#about-primus)
- [Acknowledgments](#acknowledgments)
- [License](#license)

## Introduction

The **Llama-Primus-Nemotron** series builds upon `nvidia/Llama-3.1-Nemotron-70B-Instruct` through continued training. Following the methodology described in the [Primus paper](https://arxiv.org/abs/2502.11191), we first performed continual pre-training on large-scale cybersecurity corpora (over **10B** tokens) to obtain **Llama-Primus-Nemotron-Base**. We then conducted supervised fine-tuning and applied [DELLA](https://arxiv.org/abs/2406.11617) merging with the original Nemotron model, resulting in **Llama-Primus-Nemotron-70B-Instruct**.

_Llama-Primus-Nemotron-Base_ achieves an **11.19%** improvement in the aggregate score across several public cybersecurity benchmarks (see below).
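
Since this checkpoint is a base (non-instruct) model distributed via 🤗 Transformers, it can be loaded like any other Llama-3.1-style causal LM. The sketch below is illustrative only: the repository id is an assumption inferred from the dataset organization, and the prompt and generation settings are placeholders.

```python
# Minimal generation sketch (assumptions: repo id, dtype, and generation settings).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "trendmicro-ailab/Llama-Primus-Nemotron-70B-Base"  # assumed repo id; adjust to the actual one

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # 70B parameters: multiple GPUs or offloading are typically required
    device_map="auto",
)

# Base model: plain-text prompt, no chat template (matching the benchmark setup below).
prompt = "CVE-2021-44228 is a remote code execution vulnerability in"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```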


## Benchmark Result

### Cybersecurity

| **Metric** (5-shot, **w/o** chat template)     | **Llama-3.1-Nemotron-70B-Instruct** | **Llama-Primus-Nemotron-70B-Base** |
|-------------------------------------------|-------------------------------------|----------------------------------------|
| **CTI-Bench (MCQ)**                       | 0.6900                              | 0.7148                                 |
| **CTI-Bench (CVE → CWE)**                 | 0.6590                              | 0.7410                                 |
| **CTI-Bench (CVSS, _lower is better_)**   | 1.1893                              | 1.0281                                 |
| **CTI-Bench (ATE)**                       | 0.3905                              | 0.4540                                 |
| **CyberMetric (500)**                     | 0.9380                              | 0.9280                                 |
| **SecEval**                               | 0.7177                              | 0.7208                                 |
| **CISSP (Exam Questions)**                | 0.8527                              | 0.8703                                 |
| **_Aggregate_**                           | 3.0586                              | 3.4008  **↑11.19%** 🔥                 |


CTI-Bench (CVSS) is scored with Mean Absolute Deviation (_lower is better_), CTI-Bench (ATE) with F1 score, and the remaining benchmarks with accuracy. The aggregate score (_Aggregate_) is the sum of all benchmark scores, with the CTI-Bench (CVSS) score negated.
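
As a sanity check on the arithmetic, the aggregate can be reproduced directly from the table; the short Python sketch below simply restates the numbers above and is not part of any official evaluation harness.

```python
# Reproduce the aggregate scores from the benchmark table above.
# All metrics are summed; CTI-Bench (CVSS) is a Mean Absolute Deviation (lower is better), so it is negated.
scores = {
    "CTI-Bench (MCQ)":        (0.6900, 0.7148),
    "CTI-Bench (CVE -> CWE)": (0.6590, 0.7410),
    "CTI-Bench (CVSS)":       (1.1893, 1.0281),  # negated in the aggregate
    "CTI-Bench (ATE)":        (0.3905, 0.4540),
    "CyberMetric (500)":      (0.9380, 0.9280),
    "SecEval":                (0.7177, 0.7208),
    "CISSP":                  (0.8527, 0.8703),
}

def aggregate(column: int) -> float:
    return sum(-v[column] if "CVSS" in name else v[column] for name, v in scores.items())

nemotron, primus = aggregate(0), aggregate(1)
print(f"Nemotron aggregate: {nemotron:.4f}")                 # 3.0586
print(f"Primus aggregate:   {primus:.4f}")                   # 3.4008
print(f"Improvement: {(primus - nemotron) / nemotron:.2%}")  # 11.19%
```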

References:

-  **CyberMetric**: [CyberMetric: A Benchmark Dataset based on Retrieval-Augmented...](https://arxiv.org/abs/2402.07688)
-  **CTI-Bench**: [CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence](https://arxiv.org/abs/2406.07599)
-  **SecEval**: [SecEval: A Comprehensive Benchmark for Evaluating Cybersecurity Knowledge of Foundation Models](https://xuanwuai.github.io/SecEval/)

## Training Datasets

#### Pre-training:

- **Primus-Seed-V2 (0.457B):** An enhanced version of [Primus-Seed](https://huggingface.co/datasets/trendmicro-ailab/Primus-Seed), enriched with blogs, news, books, websites, Wikipedia, MITRE, and Trend Micro knowledge.  
- **Primus-FineWeb (2.57B):** Cybersecurity text filtered from FineWeb-edu-score-2 ([dataset link](https://huggingface.co/datasets/trendmicro-ailab/Primus-FineWeb)); a loading sketch is shown after the notes below.  
- **Primus-Nemotron-CC (7.6B):** Cybersecurity text filtered from Nemotron-CC.

> **Note:** Datasets *Primus-Seed-V2* and *Primus-Nemotron-CC* are not yet open-sourced and are currently under discussion. Feel free to reach out if you're interested.

> **Disclaimer:** No Trend Micro customer information is included.
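
Of the three corpora, only *Primus-FineWeb* is currently public. A minimal loading sketch with the 🤗 `datasets` library follows; the split name and record layout are assumptions, so consult the dataset card for the authoritative schema.

```python
# Sketch: stream the public Primus-FineWeb corpus.
# Assumptions: a "train" split exists and records are dictionaries with text-like fields.
from datasets import load_dataset

ds = load_dataset("trendmicro-ailab/Primus-FineWeb", split="train", streaming=True)

first = next(iter(ds))
print(list(first.keys()))       # inspect the available columns
print(str(first)[:300], "...")  # peek at the first record
```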

## About _Primus_

_Primus_ is Trend Micro's pioneering family of lightweight, state-of-the-art open cybersecurity language models and datasets. Developed through our cutting-edge research initiatives and advanced technology, these resources share the innovative foundation that powers our enterprise-class [Trend Cybertron](https://newsroom.trendmicro.com/2025-02-25-Trend-Micro-Puts-Industry-Ahead-of-Cyberattacks-with-Industrys-First-Proactive-Cybersecurity-AI) solution. As an industry leader in cybersecurity, Trend Micro is proud to contribute these powerful, efficiency-optimized models and datasets to the community, while maintaining the excellence and reliability that define our global security standards.

## Acknowledgments

We would like to thank **NVIDIA** for generously providing computing resources (**Taipei-1**), which enabled the training and development of this model.

## License

This model is released under the MIT license, but you must also comply with the Llama 3.1 Community License Agreement.