Commit 9b299d2 by Melvin56 · verified · 1 Parent(s): a103212

Upload model via Google Colab

.gitattributes CHANGED
@@ -33,3 +33,11 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-distill-llama-8b-enkrypt-aligned-F16.gguf filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-distill-llama-8b-enkrypt-aligned-Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-distill-llama-8b-enkrypt-aligned-Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-distill-llama-8b-enkrypt-aligned-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-distill-llama-8b-enkrypt-aligned-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-distill-llama-8b-enkrypt-aligned-Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
+ deepseek-r1-distill-llama-8b-enkrypt-aligned-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+ imatrix.dat filter=lfs diff=lfs merge=lfs -text
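Each added `.gitattributes` line pairs a path pattern with four attributes: `filter=lfs`, `diff=lfs`, and `merge=lfs` route the file through Git LFS, and `-text` unsets the `text` attribute so Git does no line-ending conversion on it. As a minimal sketch of how such a line breaks down, the helper below splits one line into its pattern and attributes. The function names are hypothetical, and the parser assumes the pattern contains no spaces (quoted patterns are not handled).

```python
def parse_gitattributes_line(line):
    """Split one .gitattributes line into (pattern, attributes), or None."""
    parts = line.split()
    if not parts or parts[0].startswith("#"):
        return None  # blank line or comment
    pattern, attrs = parts[0], {}
    for token in parts[1:]:
        if token.startswith("-"):
            attrs[token[1:]] = False   # "-text": attribute explicitly unset
        elif token.startswith("!"):
            attrs[token[1:]] = None    # "!attr": attribute reset to unspecified
        elif "=" in token:
            name, value = token.split("=", 1)
            attrs[name] = value        # "filter=lfs": attribute set to a value
        else:
            attrs[token] = True        # bare name: attribute set
    return pattern, attrs


def is_lfs_tracked(attrs):
    """True when the attributes route the file through the LFS filter."""
    return attrs.get("filter") == "lfs"


pattern, attrs = parse_gitattributes_line(
    "imatrix.dat filter=lfs diff=lfs merge=lfs -text"
)
print(pattern, attrs, is_lfs_tracked(attrs))
```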
README.md ADDED
@@ -0,0 +1,83 @@
+ ---
+ base_model:
+ - deepseek-ai/DeepSeek-R1-Distill-Llama-8B
+ ---
+ # DeepSeek-R1-Distill-Llama-8B-ENK-Aligned
+
+ ## Overview
+
+ **DeepSeek-R1-Distill-Llama-8B-ENK-Aligned** is a safety-aligned version of [`deepseek-ai/DeepSeek-R1-Distill-Llama-8B`](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B). It has been aligned using the **Enkrypt AI Safety Alignment dataset**, which was generated with the **SAGE** process:
+
+ > **SAGE-RT: Synthetic Alignment data Generation for Safety Evaluation and Red Teaming**
+ > Anurakt Kumar, Divyanshu Kumar, Jatan Loya, Nitin Aravind Birur, Tanay Baswa, Sahil Agarwal, Prashanth Harshangi (2024)
+ > [[arXiv:2408.11851]](https://arxiv.org/abs/2408.11851)
+
+ This alignment significantly **reduces toxicity, harmfulness, and jailbreak vulnerabilities** across various safety topics while **maintaining model performance**.
+
+ ## Red Team Results
+
+ ![Safety Comparison](assets/safety_comparison.png)
+
+ ## Performance Results
+ | Model | MMLU-Pro Score |
+ |--------|----------------|
+ | DeepSeek-R1-Distill-Llama-8B (Base) | **44.71** |
+ | DeepSeek-R1-Distill-Llama-8B-ENK-Aligned | **46.43** |
+
+ ## Training Configuration
+
+ The model was trained using the **SimPO (Simple Preference Optimization)** approach with the following hyperparameters:
+
+ ```yaml
+ cpo_config:
+   loss_type: 'simpo'
+   max_prompt_length: 1800
+   max_length: 3600
+   per_device_train_batch_size: 8
+   gradient_accumulation_steps: 1
+   learning_rate: 1.8e-6
+   optim: 'adamw_torch'
+   lr_scheduler_type: 'cosine'
+   gradient_checkpointing: True
+   beta: 5
+   num_train_epochs: 1
+   bf16: False
+   simpo_gamma: 0.8
+   warmup_ratio: 0.1
+   cpo_alpha: 0.0
+ ```
+
+ ## Key Improvements
+
+ - **Enhanced Safety**: Significant reduction in harmful or toxic outputs.
+ - **Improved Robustness**: Stronger resistance to adversarial jailbreak prompts.
+ - **No Performance Penalty**: MMLU-Pro rises from 44.71 to 46.43 despite the additional alignment constraints.
+
+ ## Use Cases
+
+ This model is suited to applications requiring **safe, aligned, and high-performance language generation**, including:
+ - **Conversational AI**: Ensuring responsible and aligned assistant behavior.
+ - **Content Moderation**: Filtering harmful content while maintaining contextual understanding.
+ - **Education & Research**: Deploying AI in sensitive environments with reduced risks.
+
+ <!-- ## Citation
+
+ If you use this model, please cite the SAGE-RT paper:
+
+ ```bibtex
+ @misc{kumar2024sagertsyntheticalignmentdata,
+ title={SAGE-RT: Synthetic Alignment data Generation for Safety Evaluation and Red Teaming},
+ author={Anurakt Kumar and Divyanshu Kumar and Jatan Loya and Nitin Aravind Birur and Tanay Baswa and Sahil Agarwal and Prashanth Harshangi},
+ year={2024},
+ eprint={2408.11851},
+ archivePrefix={arXiv},
+ primaryClass={cs.AI},
+ url={https://arxiv.org/abs/2408.11851}
+ }
+ ``` -->
+
+ ---
+ For questions or contributions, reach out to the **Enkrypt AI** team!
+
+
+
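The README's training configuration names SimPO with `beta: 5` and `simpo_gamma: 0.8`. As a rough illustration of what those two values control, the sketch below computes the per-pair SimPO objective as described in the SimPO paper: the length-averaged log-probability of the chosen response must exceed that of the rejected response by a target margin. It assumes, as in TRL's CPO trainer with `loss_type: 'simpo'`, that `simpo_gamma` is the margin subtracted after scaling by `beta`; the README does not state which trainer was used, and the function name and sample numbers here are hypothetical.

```python
import math


def simpo_loss(avg_logp_chosen, avg_logp_rejected, beta=5.0, gamma=0.8):
    """Per-pair SimPO loss: -log(sigmoid(beta * (lp_w - lp_l) - gamma)).

    Both inputs are log-probabilities averaged over response tokens, so the
    implicit reward is length-normalized and needs no reference model.
    """
    z = beta * (avg_logp_chosen - avg_logp_rejected) - gamma
    # -log(sigmoid(z)) equals softplus(-z); this form avoids overflow.
    x = -z
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Hypothetical average log-probabilities for one preference pair.
print(simpo_loss(avg_logp_chosen=-0.9, avg_logp_rejected=-1.4))
# Equal log-probabilities still incur a loss because of the margin gamma.
print(simpo_loss(avg_logp_chosen=-1.0, avg_logp_rejected=-1.0))
```

With `cpo_alpha: 0.0`, no additional likelihood term would be mixed into this loss under the same TRL assumption.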
deepseek-r1-distill-llama-8b-enkrypt-aligned-F16.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0170a9622f0e6636e899b40f5a19eb316df8879f10e744de9a347800be7c7a74
+ size 16068894048
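Each model file in this commit is stored in the repository as a three-line Git LFS pointer giving the spec version, a SHA-256 object ID, and the size in bytes. A downloaded file can be checked against its pointer; the following is a minimal sketch of that check, with hypothetical helper names and no handling of malformed pointers.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer into its version, hash algorithm, digest, size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so multi-gigabyte files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


F16_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:0170a9622f0e6636e899b40f5a19eb316df8879f10e744de9a347800be7c7a74
size 16068894048
"""

pointer = parse_lfs_pointer(F16_POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("path/to/downloaded.gguf", pointer)` (the path is a placeholder) would then confirm that a local copy is complete and uncorrupted.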
deepseek-r1-distill-llama-8b-enkrypt-aligned-Q2_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fc613a2bb9fed4b12e4a438a237cda52ab8b031ce2ed81a0f4099b4874914a7d
+ size 3179134304
deepseek-r1-distill-llama-8b-enkrypt-aligned-Q3_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:53cb8a4f08bf1b8e900b195eacb935588cbe178f2865efd9e2fec432e76c17b7
+ size 4018920800
deepseek-r1-distill-llama-8b-enkrypt-aligned-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:20f8464bb143fdcee9fd273fc46f194cfab7f487346268037a68449eaf58c9bf
+ size 4920737120
deepseek-r1-distill-llama-8b-enkrypt-aligned-Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:90e7b2dbedc718b3eace6847ecea2cd3a8c7ead2ca7976279832befc7f771a29
+ size 5732990304
deepseek-r1-distill-llama-8b-enkrypt-aligned-Q6_K.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:59048148d770bc102860c9463095059a793c9fd886f1a6adde45d109b827e54e
+ size 6596009312
deepseek-r1-distill-llama-8b-enkrypt-aligned-Q8_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3dfc35570f680d01d6dbf6971ec2726469807be6a781984e1abb64db34929f6f
+ size 8540773728
imatrix.dat ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:632578c0108e00ced9ecff71f92dde536913679e1da7b4b9ce99d3a3fa681881
+ size 4988189
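The `size` fields in the LFS pointers above give a quick way to compare the quantization levels. The sketch below turns each file size into an approximate bits-per-weight figure by comparing it with the F16 file. This rests on the assumption that the F16 file consists almost entirely of 16-bit weights, so metadata and any tensors kept at other precisions are ignored; treat the output as a rough estimate, not a specification of each format.

```python
# File sizes in bytes, copied from the Git LFS pointers in this commit.
SIZES = {
    "F16": 16068894048,
    "Q8_0": 8540773728,
    "Q6_K": 6596009312,
    "Q5_K_M": 5732990304,
    "Q4_K_M": 4920737120,
    "Q3_K_M": 4018920800,
    "Q2_K": 3179134304,
}


def approx_bits_per_weight(size_bytes, f16_size_bytes=SIZES["F16"]):
    """Estimate bits per weight, assuming the F16 file is ~16 bits per weight."""
    return 16.0 * size_bytes / f16_size_bytes


for name, size in SIZES.items():
    gib = size / 2**30
    bpw = approx_bits_per_weight(size)
    print(f"{name:7s} {gib:6.2f} GiB  ~{bpw:.2f} bits/weight")
```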