pyToshka committed on
Commit
6d75025
·
verified ·
1 Parent(s): 5d56286

Upload model card

  1. README.md +123 -0
README.md ADDED
@@ -0,0 +1,123 @@
+ ---
+ license: apache-2.0
+ tags:
+ - security
+ - cybersecurity
+ - wazuh
+ - transformer
+ - roberta
+ - secroberta
+ - log-analysis
+ - anomaly-detection
+ language:
+ - en
+ datasets:
+ - wazuh-assist-dataset
+ metrics:
+ - accuracy
+ - precision
+ - recall
+ - f1
+ library_name: transformers
+ pipeline_tag: text-classification
+ ---
+
+ # Wazuh SecRoBERTa Security Log Classifier
+
+ ## Model Description
+
+ This is a fine-tuned SecRoBERTa model that classifies Wazuh security logs into three categories:
+ - **Benign (0)**: Normal, safe activities
+ - **Suspicious (1)**: Potentially concerning activities that require monitoring
+ - **Malicious (2)**: Confirmed threats requiring immediate action
+
+ The model is based on [jackaduma/SecRoBERTa](https://huggingface.co/jackaduma/SecRoBERTa) and fine-tuned using LoRA (Low-Rank Adaptation), which trains a small set of low-rank adapter weights instead of updating every parameter of the base model.
+
+ ## Model Architecture
+
+ - **Base Model**: SecRoBERTa (Security-focused RoBERTa)
+ - **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
+ - **Classification Head**: 3-class classifier
+ - **Additional Features**: 136-dimensional feature vector for log metadata
+ - **Max Sequence Length**: 512 tokens
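
The LoRA method listed above keeps a pretrained weight matrix `W` frozen and learns a low-rank update, so the effective weight is `W + (alpha / r) * (B @ A)`. The following is a minimal pure-Python sketch of that arithmetic only. The rank `r`, scaling `alpha`, and all matrix values are hypothetical toy numbers, not this model's actual configuration, and the helper names are illustrative.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha, r):
    """Return W + (alpha / r) * (B @ A). W stays frozen; only A and B would train."""
    delta = matmul(b, a)  # shape (d_out, d_in), rank at most r
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, r):
    """Number of parameters in the adapter matrices A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)


# Toy example with d_out = d_in = 2 and rank r = 1 (hypothetical values).
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 2.0]]    # shape (r, d_in)
B = [[0.5], [0.0]]  # shape (d_out, r)
W_eff = lora_effective_weight(W, A, B, alpha=2, r=1)
print(W_eff)
```

For a single 768 x 768 projection and an assumed rank of 8, `trainable_params(768, 768, 8)` counts the adapter weights that would be trained in place of the full matrix, which is where the efficiency of the method comes from.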
+
+ ## Training Details
+
+ - **Training Framework**: PyTorch + HuggingFace Transformers + PEFT
+ - **Loss Function**: Focal Loss (for handling class imbalance)
+ - **Optimization**: AdamW with learning rate scheduling
+ - **Data**: Wazuh security logs
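
To illustrate why focal loss helps with class imbalance, here is a small pure-Python sketch of the standard formulation `FL = -(1 - p_t)^gamma * log(p_t)`, where `p_t` is the predicted probability of the true class. The value `gamma=2.0` is a commonly used default and an assumption here; the model card does not state the value used in training, and the function names are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one example; gamma=0 reduces it to plain cross-entropy."""
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Hypothetical logits for one log line; class order is benign, suspicious, malicious.
easy = focal_loss([5.0, 0.0, 0.0], target=0)  # confident and correct
hard = focal_loss([0.0, 5.0, 0.0], target=0)  # confident and wrong
print(f"easy example loss: {easy:.6f}, hard example loss: {hard:.6f}")
```

The `(1 - p_t)^gamma` factor shrinks the loss of examples the model already classifies confidently, so the abundant easy examples of a majority class contribute less to the gradient than the hard, often minority-class, ones.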
+
+ ## Usage
+
+ ### Using the transformers library:
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+
+ # Load model and tokenizer
+ model_name = "pyToshka/wazuh-assist"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForSequenceClassification.from_pretrained(model_name)
+
+ # Prepare input
+ text = "Failed login attempt from IP 192.168.1.100"
+ inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
+
+ # Make prediction (no gradients are needed for inference)
+ with torch.no_grad():
+     outputs = model(**inputs)
+     logits = outputs.logits
+     predicted_class = torch.argmax(logits, dim=-1).item()
+
+ # Class mapping
+ class_names = ["benign", "suspicious", "malicious"]
+ prediction = class_names[predicted_class]
+ print(f"Prediction: {prediction}")
+ ```
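
The snippet above reports only the arg-max label. A confidence score can be derived by passing the logits through a softmax. The sketch below does this in pure Python on hypothetical logit values so it runs without downloading the model; the function name `label_with_confidence` is illustrative. With PyTorch tensors, `torch.softmax(logits, dim=-1)` performs the same normalization.

```python
import math

CLASS_NAMES = ["benign", "suspicious", "malicious"]


def label_with_confidence(logits, class_names=CLASS_NAMES):
    """Map raw logits to (label, probability of that label) via softmax."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Hypothetical logits standing in for outputs.logits[0].tolist()
label, confidence = label_with_confidence([0.2, 1.5, 3.1])
print(f"Prediction: {label} (confidence: {confidence:.3f})")
```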
+
+ ### Using the project's custom class:
+
+ ```python
+ from src.models.secroberta import WazuhSecRoBERTa
+
+ # Load model
+ model = WazuhSecRoBERTa.load_model("pyToshka/wazuh-assist")
+
+ # Make prediction
+ log_text = "Failed login attempt from IP 192.168.1.100"
+ prediction, confidence = model.predict(log_text)
+ print(f"Prediction: {prediction} (confidence: {confidence:.3f})")
+ ```
+
+ ## Performance
+
+ The model achieves strong performance on Wazuh log classification:
+ - High precision for malicious activity detection
+ - Good recall for suspicious activity monitoring
+ - Balanced accuracy across all three classes
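
The card gives no figures for these metrics. As a sketch of how they could be computed on a held-out set, the pure-Python code below derives per-class precision, recall, and F1, plus balanced accuracy taken as the mean of per-class recall. The label lists are hypothetical toy data, not results from this model, and the function names are illustrative.

```python
def per_class_metrics(y_true, y_pred, num_classes=3):
    """Return {class_id: {"precision", "recall", "f1"}} from integer labels."""
    metrics = {}
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[c] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def balanced_accuracy(y_true, y_pred, num_classes=3):
    """Mean of per-class recall; assumes every class occurs in y_true."""
    per_class = per_class_metrics(y_true, y_pred, num_classes)
    return sum(m["recall"] for m in per_class.values()) / num_classes


# Hypothetical labels: 0 = benign, 1 = suspicious, 2 = malicious
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
for class_id, scores in per_class_metrics(y_true, y_pred).items():
    print(class_id, scores)
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
```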
+
+ ## Deployment
+
+ This model can be deployed using:
+ - **ONNX Runtime**: For production inference
+ - **FastAPI**: REST API server included in the project
+ - **Docker**: Containerized deployment available
+
+ ## Citation
+
+ ```bibtex
+ @misc{wazuh-assist-2025,
+   title={Wazuh SecRoBERTa Security Log Classifier},
+   author={Your Organization},
+   year={2024},
+   howpublished={\url{https://huggingface.co/pyToshka/wazuh-assist}},
+ }
+ ```
+
+ ## License
+
+ BSD 3-Clause License
+
+