Hakim
h4c5
AI & ML interests
None yet
Recent Activity
liked a dataset 17 days ago
walledai/AdvBench
updated a collection 17 days ago
moderation-prompts
liked a dataset 17 days ago
HuggingFaceH4/ultrachat_200k
Organizations
Collections
4
- mmathys/openai-moderation-api-evaluation
  Viewer • Updated • 1.68k • 362 • 31
- Anthropic/hh-rlhf
  Viewer • Updated • 169k • 15.6k • 1.33k
- WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs
  Paper • 2406.18495 • Published • 13
- ShieldGemma: Generative AI Content Moderation Based on Gemma
  Paper • 2407.21772 • Published • 14
models
2
datasets
0
None public yet