jplhughes2/alignment-faking-synthetic-chat-dataset-recall-0k-docs-8k-benign-2k-refusals • Updated Feb 6 • 10k • 15
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-0k-docs-20k-benign-10k-refusals • Updated Feb 6 • 29.4k • 10
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-5k-docs-8k-benign-2k-refusals • Updated Feb 3 • 15k • 9
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-5k-docs-4k-benign-1k-refusals • Updated Feb 3 • 10k • 9
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-10k-docs-8k-benign-2k-refusals • Updated Feb 3 • 20k • 8
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-30k-docs-0k-benign-0k-refusals • Updated Feb 3 • 30k • 13
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-20k-docs-0k-benign-0k-refusals • Updated Feb 3 • 20k • 15
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-10k-docs-0k-benign-0k-refusals • Updated Feb 3 • 10k • 11
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-5k-docs-0k-benign-0k-refusals • Updated Feb 3 • 5k • 18 • 1
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-90k-benign-50k-refusals • Updated Feb 3 • 149k • 6
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-30k-benign-20k • Updated Feb 3 • 50k • 8
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-30k-benign-20k-refusals • Updated Feb 3 • 59.4k • 13
jplhughes2/alignment-faking-synthetic-chat-dataset-recall-90k-benign-20k-refusals • Updated Feb 3 • 119k • 11