Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks? Paper • 2404.03411 • Published Apr 4, 2024
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases Paper • 2407.12784 • Published Jul 17, 2024
Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risk of Language Models Paper • 2408.08926 • Published Aug 15, 2024