A Robust, Diverse and Challenging Benchmark for Measuring Cultural Knowledge of LLMs
Kelly Chiu PRO
kellycyy
AI & ML interests
None yet
Recent Activity
Updated a dataset 3 days ago: kellycyy/AIRiskDilemmas
Commented on a paper 3 days ago: Will AI Tell Lies to Save Sick Children? Litmus-Testing AI Values Prioritization with AIRiskDilemmas
Organizations
Collections: 1
Papers: 1
Models: 0 (none public yet)
Datasets: 5
kellycyy/AIRiskDilemmas • 42.6k • 137
kellycyy/daily_dilemmas • 17.7k • 99 • 3
kellycyy/CulturalBench • 6.14k • 722 • 4
kellycyy/wildentities_classify • 8.61k • 7
kellycyy/wildchat-factual-classify • 8.53k • 9