---
license: mit
language:
- en
- ja
base_model:
- nvidia/Llama-3.1-Nemotron-70B-Instruct
pipeline_tag: text-generation
extra_gated_fields:
  Affiliation: text
  Country: country
  I want to use this model for:
    type: select
    options:
      - Research
      - Commercial
      - label: Other
        value: other
  Job title:
    type: select
    options:
      - Student
      - Research graduate
      - AI researcher
      - AI developer/engineer
      - Cybersecurity researcher
      - Reporter
      - Other
  geo: ip_location
library_name: transformers
datasets:
- trendmicro-ailab/Primus-FineWeb
tags:
- cybersecurity
---
# Llama-Primus-Nemotron-70B-Base
<img src="https://i.imgur.com/yzitCm9.jpeg" alt="Llama-Primus-Nemotron" width="60%">
- [Introduction](#introduction)
- [Benchmark Result](#benchmark-result)
- [Training Datasets](#training-datasets)
- [About _Primus_](#about-primus)
- [Acknowledgments](#acknowledgments)
- [License](#license)
## Introduction
The **Llama-Primus-Nemotron** series builds on `nvidia/Llama-3.1-Nemotron-70B-Instruct` through continued training. Following the methodology of the [Primus paper](https://arxiv.org/abs/2502.11191), we first continued pre-training on large-scale cybersecurity corpora (over **10B** tokens) to obtain **Llama-Primus-Nemotron-70B-Base** (this model). We then performed supervised fine-tuning and applied [DELLA](https://arxiv.org/abs/2406.11617) merging with the original Nemotron model, resulting in **Llama-Primus-Nemotron-70B-Instruct**.
_Llama-Primus-Nemotron-70B-Base_ achieves an **11.19%** improvement in the aggregate score across several public cybersecurity benchmarks.
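For reference, here is a minimal inference sketch with Hugging Face `transformers`. The repository id is an assumption based on this card's title (check the page URL), and a 70B model in bfloat16 requires several high-memory GPUs or quantization:

```python
# Minimal inference sketch; the repo id below is an assumption from this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "trendmicro-ailab/Llama-Primus-Nemotron-70B-Base"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 70B weights: plan for multiple GPUs
    device_map="auto",
)

# This is a base (non-chat) checkpoint, so prompt it as plain completion,
# matching the 5-shot, no-chat-template evaluation setup below.
prompt = "CVE-2021-44228 is a remote code execution vulnerability in"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```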
## Benchmark Result
### Cybersecurity
| **Metric** (5-shot, **w/o** chat template) | **Llama-3.1-Nemotron-70B-Instruct** | **Llama-Primus-Nemotron-70B-Base** |
|-------------------------------------------|-------------------------------------|----------------------------------------|
| **CTI-Bench (MCQ)** | 0.6900 | 0.7148 |
| **CTI-Bench (CVE → CWE)**                 | 0.6590                              | 0.7410                                  |
| **CTI-Bench (CVSS, _lower is better_)** | 1.1893 | 1.0281 |
| **CTI-Bench (ATE)** | 0.3905 | 0.4540 |
| **CyberMetric (500)** | 0.9380 | 0.9280 |
| **SecEval** | 0.7177 | 0.7208 |
| **CISSP (Exam Questions)** | 0.8527 | 0.8703 |
| **_Aggregate_**                           | 3.0586                              | 3.4008 **↑11.19%** 🔥                   |
CTI-Bench (CVSS) is scored with Mean Absolute Deviation (_lower is better_), CTI-Bench (ATE) with F1, and the remaining benchmarks with accuracy. The aggregate score (_Aggregate_) is the sum of all benchmark scores, with CTI-Bench (CVSS) negated.
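To make the aggregation explicit, here is a short Python sketch that reproduces the _Aggregate_ row from the table above:

```python
# Recompute the aggregate row: sum all benchmark scores per model,
# negating CTI-Bench (CVSS) because its MAD metric is lower-is-better.
scores = {  # (Nemotron-70B-Instruct, Primus-Nemotron-70B-Base)
    "CTI-Bench (MCQ)":       (0.6900, 0.7148),
    "CTI-Bench (CVE → CWE)": (0.6590, 0.7410),
    "CTI-Bench (CVSS)":      (1.1893, 1.0281),  # MAD, negated below
    "CTI-Bench (ATE)":       (0.3905, 0.4540),  # F1
    "CyberMetric (500)":     (0.9380, 0.9280),
    "SecEval":               (0.7177, 0.7208),
    "CISSP":                 (0.8527, 0.8703),
}

def aggregate(col: int) -> float:
    return sum(-v[col] if "CVSS" in name else v[col] for name, v in scores.items())

base, primus = aggregate(0), aggregate(1)
print(f"{base:.4f} -> {primus:.4f} (+{(primus - base) / base:.2%})")
# 3.0586 -> 3.4008 (+11.19%)
```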
References:
- **CyberMetric**: [CyberMetric: A Benchmark Dataset based on Retrieval-Augmented Generation for Evaluating LLMs in Cybersecurity Knowledge](https://arxiv.org/abs/2402.07688)
- **CTI-Bench**: [CTIBench: A Benchmark for Evaluating LLMs in Cyber Threat Intelligence](https://arxiv.org/abs/2406.07599)
- **SecEval**: [SecEval: A Comprehensive Benchmark for Evaluating Cybersecurity Knowledge of Foundation Models](https://xuanwuai.github.io/SecEval/)
## Training Datasets
### Pre-training
- **Primus-Seed-V2 (0.457B):** An enhanced version of [Primus-Seed](https://huggingface.co/datasets/trendmicro-ailab/Primus-Seed), enriched with blogs, news, books, websites, Wikipedia, MITRE and Trend Micro knowledge.
- **Primus-FineWeb (2.57B):** Cybersecurity text filtered from FineWeb-edu-score-2 ([link](https://huggingface.co/datasets/trendmicro-ailab/Primus-FineWeb)); see the loading sketch after the notes below.
- **Primus-Nemotron-CC (7.6B):** Cybersecurity text filtered from Nemotron-CC.
> **Note:** *Primus-Seed-V2* and *Primus-Nemotron-CC* are not yet open-sourced; their release is still under discussion. Feel free to reach out if you're interested.
> **Disclaimer:** No Trend Micro customer information is included.
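Of the three corpora, only Primus-FineWeb is public so far. Below is a minimal sketch for streaming it with the `datasets` library; the split name and record schema are assumptions, so consult the dataset card:

```python
# Stream Primus-FineWeb without downloading the full 2.57B-token corpus.
# The "train" split and record fields are assumptions; see the dataset card.
from datasets import load_dataset

ds = load_dataset("trendmicro-ailab/Primus-FineWeb", split="train", streaming=True)
first = next(iter(ds))
print(first.keys())  # inspect the available fields on the first record
```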
## About _Primus_
_Primus_ is Trend Micro's pioneering family of lightweight, state-of-the-art open cybersecurity language models and datasets. Developed through our cutting-edge research initiatives and advanced technology, these resources share the innovative foundation that powers our enterprise-class [Trend Cybertron](https://newsroom.trendmicro.com/2025-02-25-Trend-Micro-Puts-Industry-Ahead-of-Cyberattacks-with-Industrys-First-Proactive-Cybersecurity-AI) solution. As an industry leader in cybersecurity, Trend Micro is proud to contribute these powerful, efficiency-optimized models and datasets to the community, while maintaining the excellence and reliability that define our global security standards.
## Acknowledgments
We would like to thank **NVIDIA** for generously providing computing resources (**Taipei-1**), which enabled the training and development of this model.
## License
This model is released under the MIT license, but you must also comply with the Llama 3.1 Community License Agreement.