DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails • arXiv:2502.05163 • Published Feb 7, 2025
Investigating the Impact of Quantization Methods on the Safety and Reliability of Large Language Models • arXiv:2502.15799 • Published Feb 18, 2025
AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement • arXiv:2502.16776 • Published Feb 2025
LettuceDetect: A Hallucination Detection Framework for RAG Applications • arXiv:2502.17125 • Published Feb 2025
SafeArena: Evaluating the Safety of Autonomous Web Agents • arXiv:2503.04957 • Published Mar 2025