ELISARCyberAIEdge7B-LoRA-GGUF

Offline-ready, GGUF-quantized edge model (Mistral-7B backbone, LLaMA-family architecture) for cybersecurity use cases


πŸ“„ Paper Title

ELISAR: An Adaptive Framework for Cybersecurity Risk Assessment Powered by GenAI

πŸ‘€ Authors

  • Sabri SALLANI, PhD – AI & Cybersecurity Expert
  • Karam BOU-CHAAYA, PhD – AI & Cybersecurity Expert
  • Helmi RAIS – Global Practice Lead, Expleo France

πŸ“… Date

May 31, 2025

πŸ”— Model Repository

https://huggingface.co/sallani/ELISARCyberAIEdge7B-LoRA-GGUF

πŸ“š Publication

This work will be published by Springer in the following book:
πŸ‘‰ https://link.springer.com/book/9783031935978
πŸ—“οΈ Expected publication date: July 10, 2025

🧠 Summary

ELISAR is a LoRA fine-tune of Mistral-7B, designed for contextualized cybersecurity risk assessment using Retrieval-Augmented Generation (RAG) and agentic AI capabilities. The model targets real-world use cases across the three modes below (a minimal mode-routing sketch follows the list):

  • Threat modeling (Blue ELISAR)
  • Offensive use-case generation (Red ELISAR)
  • GRC compliance automation (GRC ELISAR)
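
A minimal sketch of how these modes could be driven from Python; the mode names map to hypothetical system prompts (the exact prompts used in training are not published in this card):

# Illustrative only: MODE_PROMPTS and build_prompt are assumptions, not released artifacts.
MODE_PROMPTS = {
    "blue": "You are Blue ELISAR, a defensive threat-modeling assistant.",
    "red":  "You are Red ELISAR, an offensive use-case generation assistant.",
    "grc":  "You are GRC ELISAR, a GRC compliance-automation assistant.",
}

def build_prompt(mode: str, task: str) -> str:
    """Assemble an instruction-format prompt (see Prompt Guidelines below)."""
    return f"{MODE_PROMPTS[mode]}\n\n### Instruction:\n{task}\n\n### Response:\n"

print(build_prompt("blue", "Model the threats for an internet-facing VPN gateway."))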

πŸ“Œ Use Cases

  • ISO/IEC 42001 & NIS2 risk evaluation
  • Threat scenario generation
  • AI audit preparation and reporting
  • Secure AI system design
  • …

πŸ“– Overview

ELISARCyberAIEdge7B-LoRA-GGUF is a LoRA-fine-tuned, GGUF-quantized version of the Mistral-7B backbone, tailored for edge deployment in cybersecurity and blue-team AI scenarios. It was developed by Dr. Sabri Sallani (PhD).

πŸ“₯ Download model file:
➑️ Click here to download elisar_merged.gguf
(~5.13 GB GGUF quantized model for offline inference)

The build pipeline integrates:

  1. Base model: Mistral-7B-v0.3 (FP16 / BF16)
  2. LoRA adapter: sallani/ELISARCyberAIEdge7B
  3. Quantization: Converted to GGUF format and optionally quantized to Q4_K_M (4-bit) for efficient inference on resource-constrained devices (NVIDIA T4, desktop GPUs, etc.).

This pipeline produces a single file (elisar_merged.gguf) of ~5.13 GB that you can deploy offline using frameworks like llama.cpp or run through minimal Torch-based inference. A sketch of the merge step is shown below.
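
For reference, a minimal sketch of the merge step, assuming the public adapter id sallani/ELISARCyberAIEdge7B on the stock Mistral base (the exact merge settings used for the released file are not published here):

# Merge sketch: fold the LoRA adapter into the base weights, then save.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "mistralai/Mistral-7B-v0.3"
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# Apply the adapter; merge_and_unload() bakes its weights into the base model.
merged = PeftModel.from_pretrained(base, "sallani/ELISARCyberAIEdge7B").merge_and_unload()

merged.save_pretrained("elisar_merged_hf")
AutoTokenizer.from_pretrained(base_id).save_pretrained("elisar_merged_hf")

The merged checkpoint can then be converted with llama.cpp's convert_hf_to_gguf.py script and quantized with the llama-quantize tool (e.g. to Q4_K_M).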

Key features:

  • Compact (~5 GB) quantized GGUF file
  • Edge-friendly: runs on CPU or low-end GPUs with fast cold-start
  • Cybersecurity-tuned: trained for cybersecurity Q&A, log analysis, malware triage, and blue-team playbook assistance
  • Offline inference: execute entirely without internet access

πŸš€ Quickstart

1. Download model files

# Clone or download the GGUF file directly:
wget https://huggingface.co/sallani/ELISARCyberAIEdge7B-LoRA-GGUF/resolve/main/elisar_merged.gguf -O elisar_merged.gguf

Alternatively, using the Hugging Face Hub CLI:

pip install -U "huggingface_hub[cli]"
huggingface-cli login  # paste your Hugging Face access token (only needed for gated/private repos)
huggingface-cli download sallani/ELISARCyberAIEdge7B-LoRA-GGUF --local-dir ELISARCyberAIEdge7B-LoRA-GGUF
cd ELISARCyberAIEdge7B-LoRA-GGUF
tree
# β”œβ”€β”€ elisar_merged.gguf
# └── README.md

πŸ’Ώ Installation

1. llama.cpp (Offline inference)

# Clone llama.cpp repository (if not already):
git clone --depth 1 https://github.com/ggml-org/llama.cpp.git
cd llama.cpp

# Build with CUDA support (optional; requires the CUDA toolkit).
# CMAKE_CUDA_ARCHITECTURES=75 targets NVIDIA T4 (sm_75); adjust for your GPU.
cmake -B build -DGGML_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=75
cmake --build build --config Release

# Or build CPU-only:
# cmake -B build && cmake --build build --config Release

2. Python (Transformers) – Optional hybrid inference

python3 -m venv venv
source venv/bin/activate
pip install torch transformers peft accelerate

⚑️ Usage Examples

A. Offline inference with llama.cpp

cd llama.cpp
./build/bin/llama-cli -m ../ELISARCyberAIEdge7B-LoRA-GGUF/elisar_merged.gguf -c 2048 -t 8 \
  -e -p "### Instruction:\nSummarize the key risk indicators in this proxy log: ...\n\n### Response:\n"
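
If you prefer to drive the GGUF file from Python without Transformers, the llama-cpp-python bindings (an extra dependency, not part of this repository) wrap the same backend; a minimal sketch:

# pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(model_path="elisar_merged.gguf", n_ctx=2048, n_threads=8)
prompt = "### Instruction:\nAnalyze this auth log excerpt for brute-force attempts: ...\n\n### Response:\n"
out = llm(prompt, max_tokens=256, temperature=0.7, top_p=0.9)
print(out["choices"][0]["text"])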

B. Python / Transformers Inference (Hybrid)

from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
import torch

# This repository hosts a GGUF checkpoint; recent transformers releases can load
# it by passing gguf_file (weights are dequantized to the requested dtype).
model_id = "sallani/ELISARCyberAIEdge7B-LoRA-GGUF"
gguf_file = "elisar_merged.gguf"

tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    gguf_file=gguf_file,
    torch_dtype=torch.float16,
    device_map="auto",  # requires the accelerate package
)

prompt = "You are a blue-team AI assistant. Analyze the following network log for suspicious patterns: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

gen_config = GenerationConfig(
    do_sample=True,  # temperature/top_p only take effect when sampling
    temperature=0.7,
    top_p=0.9,
    max_new_tokens=256,
)
output_ids = model.generate(**inputs, generation_config=gen_config)
answer = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(answer)
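
Note that loading a GGUF file through Transformers dequantizes the weights to the requested dtype, so this path needs roughly the memory of an fp16 7B model (~14 GB); for genuinely low-memory edge inference, prefer the llama.cpp path above.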

πŸ“¦ File Structure

ELISARCyberAIEdge7B-LoRA-GGUF/
β”œβ”€β”€ elisar_merged.gguf
└── README.md

πŸ”§ Model Details & Training

  • Base: Mistral-7B-v0.3 (7B params)
  • LoRA adapter: sallani/ELISARCyberAIEdge7B
  • Quantization: GGUF Q4_K_M, final size ~5.13 GB
  • Training data: CVEs, SAST, security logs, blue-team playbooks
  • License: Apache 2.0

Developed by Dr. Sabri Sallani, PhD – Expert in Artificial Intelligence & Cybersecurity.


πŸ“œ Prompt Guidelines

  • Use instruction format: ### Instruction: / ### Response: (see the example after this list)
  • Add relevant logs/code in prompt
  • Not a replacement for certified analysts
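
For example, a minimal prompt following this template (the log lines are illustrative sample data):

### Instruction:
Review the following SSH auth log excerpt and flag suspicious activity:
Jan 12 03:14:07 host sshd[912]: Failed password for root from 203.0.113.7 port 52214 ssh2
Jan 12 03:14:09 host sshd[914]: Failed password for root from 203.0.113.7 port 52216 ssh2

### Response: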

πŸ“œ Citation

If you use this model or refer to the ELISAR framework in your research, please cite:

@incollection{elisar2025,
  author    = {Sabri Sallani and Karam Bou-Chaaya and Helmi Rais},
  title     = {ELISAR: An Adaptive Framework for Cybersecurity Risk Assessment Powered by GenAI},
  booktitle = {Communications in Computer and Information Science},
  volume    = {2518},
  publisher = {Springer},
  year      = {2025},
  note      = {To be published on July 10, 2025},
  url       = {https://link.springer.com/book/9783031935978}
}

Or simply cite:

Sallani, S., Bou-Chaaya, K., & Rais, H. (2025). ELISAR: An Adaptive Framework for Cybersecurity Risk Assessment Powered by GenAI. In Communications in Computer and Information Science, vol. 2518. Springer. To appear July 10, 2025. https://link.springer.com/book/9783031935978


πŸ’¬ Support & Contact

For questions or issues, please open a discussion on the model's Hugging Face page.

Thank you for using ELISARCyberAIEdge7B-LoRA-GGUF – helping secure your edge AI.
