ELISARCyberAIEdge7B-LoRA-GGUF

Offline-ready, GGUF-quantized edge model (Mistral-7B backbone, LLaMA-family architecture) for cybersecurity use cases


πŸ“„ Paper Title

ELISAR: An Adaptive Framework for Cybersecurity Risk Assessment Powered by GenAI

πŸ‘€ Authors

  • Sabri SALLANI, PhD – AI & Cybersecurity Expert
  • Karam BOU-CHAAYA, PhD – AI & Cybersecurity Expert
  • Helmi RAIS – Global Practice Lead, Expleo France

πŸ“… Date

May 31, 2025

πŸ”— Model Repository

https://huggingface.co/sallani/ELISARCyberAIEdge7B-LoRA-GGUF

πŸ“š Publication

This work will be published by Springer in the following book:
πŸ‘‰ https://link.springer.com/book/9783031935978
πŸ—“οΈ Expected publication date: July 10, 2025

🧠 Summary

ELISAR is a LoRA fine-tune of Mistral-7B, designed for contextualized cybersecurity risk assessment using Retrieval-Augmented Generation (RAG) and agentic AI capabilities. The model targets real-world use cases across the three modes below (a minimal mode-routing sketch follows the list):

  • Threat modeling (Blue ELISAR)
  • Offensive use-case generation (Red ELISAR)
  • GRC compliance automation (GRC ELISAR)
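
A minimal sketch of how these modes could be driven from Python; the mode names map to hypothetical system prompts (the exact prompts used in training are not published in this card):

# Illustrative only: MODE_PROMPTS and build_prompt are assumptions, not released artifacts.
MODE_PROMPTS = {
    "blue": "You are Blue ELISAR, a defensive threat-modeling assistant.",
    "red":  "You are Red ELISAR, an offensive use-case generation assistant.",
    "grc":  "You are GRC ELISAR, a GRC compliance-automation assistant.",
}

def build_prompt(mode: str, task: str) -> str:
    """Assemble an instruction-format prompt (see Prompt Guidelines below)."""
    return f"{MODE_PROMPTS[mode]}\n\n### Instruction:\n{task}\n\n### Response:\n"

print(build_prompt("blue", "Model the threats for an internet-facing VPN gateway."))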

πŸ“Œ Use Cases

  • ISO/IEC 42001 & NIS2 risk evaluation
  • Threat scenario generation
  • AI audit preparation and reporting
  • Secure AI system design
  • …

πŸ“– Overview

ELISARCyberAIEdge7B-LoRA-GGUF is a LoRA-fine-tuned, GGUF-quantized version of the Mistral-7B backbone, tailored for edge deployment in cybersecurity and blue-team AI scenarios. It was developed by Dr. Sabri Sallani (PhD).

πŸ“₯ Download model file:
➑️ Click here to download elisar_merged.gguf
(~5.13 GB GGUF quantized model for offline inference)

The build pipeline integrates:

  1. Base model: Mistral-7B-v0.3 (FP16 / BF16)
  2. LoRA adapter: sallani/ELISARCyberAIEdge7B
  3. Quantization: Converted to GGUF format and optionally quantized to Q4_K_M (4-bit) for efficient inference on resource-constrained devices (NVIDIA T4, desktop GPUs, etc.).

This pipeline produces a single file (elisar_merged.gguf) of ~5.13 GB that you can deploy offline using frameworks like llama.cpp or run through minimal Torch-based inference. A sketch of the merge step is shown below.
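
For reference, a minimal sketch of the merge step, assuming the public adapter id sallani/ELISARCyberAIEdge7B on the stock Mistral base (the exact merge settings used for the released file are not published here):

# Merge sketch: fold the LoRA adapter into the base weights, then save.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "mistralai/Mistral-7B-v0.3"
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# Apply the adapter; merge_and_unload() bakes its weights into the base model.
merged = PeftModel.from_pretrained(base, "sallani/ELISARCyberAIEdge7B").merge_and_unload()

merged.save_pretrained("elisar_merged_hf")
AutoTokenizer.from_pretrained(base_id).save_pretrained("elisar_merged_hf")

The merged checkpoint can then be converted with llama.cpp's convert_hf_to_gguf.py script and quantized with the llama-quantize tool (e.g. to Q4_K_M).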

Key features:

  • Compact (~5 GB) quantized GGUF file
  • Edge-friendly: runs on CPU or low-end GPUs with fast cold-start
  • Cybersecurity-tuned: trained for cybersecurity Q&A, log analysis, malware triage, and blue-team playbook assistance
  • Offline inference: execute entirely without internet access

πŸš€ Quickstart

1. Download model files

# Clone or download the GGUF file directly:
wget https://huggingface.co/sallani/ELISARCyberAIEdge7B-LoRA-GGUF/resolve/main/elisar_merged.gguf -O elisar_merged.gguf

Alternatively, using the Hugging Face Hub CLI:

pip install -U "huggingface_hub[cli]"
huggingface-cli login  # paste your Hugging Face access token (only needed for gated/private repos)
huggingface-cli download sallani/ELISARCyberAIEdge7B-LoRA-GGUF --local-dir ELISARCyberAIEdge7B-LoRA-GGUF
cd ELISARCyberAIEdge7B-LoRA-GGUF
tree
# β”œβ”€β”€ elisar_merged.gguf
# └── README.md

πŸ’Ώ Installation

1. llama.cpp (Offline inference)

# Clone llama.cpp repository (if not already):
git clone --depth 1 https://github.com/ggml-org/llama.cpp.git
cd llama.cpp

# Build with CUDA support (optional; requires the CUDA toolkit).
# CMAKE_CUDA_ARCHITECTURES=75 targets NVIDIA T4 (sm_75); adjust for your GPU.
cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=75
cmake --build build --config Release

# Or build CPU-only:
# cmake -B build && cmake --build build --config Release

2. Python (Transformers) – Optional hybrid inference

python3 -m venv venv
source venv/bin/activate
pip install torch transformers peft accelerate

⚑️ Usage Examples

A. Offline inference with llama.cpp

cd llama.cpp
./build/bin/llama-cli -m ../ELISARCyberAIEdge7B-LoRA-GGUF/elisar_merged.gguf -c 2048 -t 8 \
  -e -p "### Instruction:\nSummarize the key risk indicators in this proxy log: ...\n\n### Response:\n"
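
If you prefer to drive the GGUF file from Python without Transformers, the llama-cpp-python bindings (an extra dependency, not part of this repository) wrap the same backend; a minimal sketch:

# pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(model_path="elisar_merged.gguf", n_ctx=2048, n_threads=8)
prompt = "### Instruction:\nAnalyze this auth log excerpt for brute-force attempts: ...\n\n### Response:\n"
out = llm(prompt, max_tokens=256, temperature=0.7, top_p=0.9)
print(out["choices"][0]["text"])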

B. Python / Transformers Inference (Hybrid)

from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
import torch

# This repository hosts a GGUF checkpoint; recent transformers releases can load
# it by passing gguf_file (weights are dequantized to the requested dtype).
model_id = "sallani/ELISARCyberAIEdge7B-LoRA-GGUF"
gguf_file = "elisar_merged.gguf"

tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    gguf_file=gguf_file,
    torch_dtype=torch.float16,
    device_map="auto",  # requires the accelerate package
)

prompt = "You are a blue-team AI assistant. Analyze the following network log for suspicious patterns: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

gen_config = GenerationConfig(
    do_sample=True,  # temperature/top_p only take effect when sampling
    temperature=0.7,
    top_p=0.9,
    max_new_tokens=256,
)
output_ids = model.generate(**inputs, generation_config=gen_config)
answer = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(answer)
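
Note that loading a GGUF file through Transformers dequantizes the weights to the requested dtype, so this path needs roughly the memory of an fp16 7B model (~14 GB); for genuinely low-memory edge inference, prefer the llama.cpp path above.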

πŸ“¦ File Structure

ELISARCyberAIEdge7B-LoRA-GGUF/
β”œβ”€β”€ elisar_merged.gguf
└── README.md

πŸ”§ Model Details & Training

  • Base: Mistral-7B-v0.3 (7B params)
  • LoRA adapter: sallani/ELISARCyberAIEdge7B
  • Quantization: GGUF Q4_K_M, final size ~5.13 GB
  • Training data: CVEs, SAST, security logs, blue-team playbooks
  • License: Apache 2.0

Developed by Dr. Sabri Sallani, PhD – Expert in Artificial Intelligence & Cybersecurity.


πŸ“œ Prompt Guidelines

  • Use instruction format: ### Instruction: / ### Response: (see the example after this list)
  • Add relevant logs/code in prompt
  • Not a replacement for certified analysts
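
For example, a minimal prompt following this template (the log lines are illustrative sample data):

### Instruction:
Review the following SSH auth log excerpt and flag suspicious activity:
Jan 12 03:14:07 host sshd[912]: Failed password for root from 203.0.113.7 port 52214 ssh2
Jan 12 03:14:09 host sshd[914]: Failed password for root from 203.0.113.7 port 52216 ssh2

### Response: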

πŸ“œ Citation

If you use this model or refer to the ELISAR framework in your research, please cite:

@incollection{elisar2025,
  author    = {Sabri Sallani and Karam Bou-Chaaya and Helmi Rais},
  title     = {ELISAR: An Adaptive Framework for Cybersecurity Risk Assessment Powered by GenAI},
  booktitle = {Communications in Computer and Information Science},
  volume    = {2518},
  publisher = {Springer},
  year      = {2025},
  note      = {To be published on July 10, 2025},
  url       = {https://link.springer.com/book/9783031935978}
}

Or simply cite:

Sallani, S., Bou-Chaaya, K., & Rais, H. (2025). ELISAR: An Adaptive Framework for Cybersecurity Risk Assessment Powered by GenAI. In Communications in Computer and Information Science, vol. 2518. Springer. To appear July 10, 2025. https://link.springer.com/book/9783031935978


πŸ’¬ Support & Contact

For questions or issues, please open a discussion on the model's Hugging Face page.

Thank you for using ELISARCyberAIEdge7B-LoRA-GGUF – helping secure your edge AI.
