cooperleong00/Qwen3-8B-Jailbroken

cooperleong00 committed 17 days ago

Commit c62f33d (verified) · 1 Parent(s): 13ce2ac

Create README.md

Files changed (1)

README.md +26 -0

README.md ADDED

	@@ -0,0 +1,26 @@

+---
+base_model:
+- Qwen/Qwen3-8B
+language:
+- zho
+- eng
+- fra
+- spa
+- por
+- deu
+- ita
+- rus
+- jpn
+- kor
+- vie
+- tha
+- ara
+---
+
+**This jailbroken LLM is released strictly for academic research purposes in AI safety and model alignment studies. The author bears no responsibility for any misuse or harm resulting from the deployment of this model. Users must comply with all applicable laws and ethical guidelines when conducting research.**
+
+A jailbroken Qwen3-8B model created using weight orthogonalization [1].
+
+Implementation script: https://gist.github.com/cooperleong00/14d9304ba0a4b8dba91b60a873752d25
+
+[1]: Arditi, Andy, et al. "Refusal in language models is mediated by a single direction." arXiv preprint arXiv:2406.11717 (2024).