Richard Ren

notrichardren

AI & ML interests

robustness, interpretability, probing, truthfulness

Organizations

Center for AI Safety, Truthfulness & Deception Research Team, Robust Control

models 4

notrichardren/lorra_tqa_7b

Updated Jan 26

notrichardren/zephyr-7b-sft-qlora-alignment-10000

Updated May 11, 2024 • 1

notrichardren/zephyr-7b-sft-qlora-pig-latin-10000-v2

Updated May 11, 2024

notrichardren/zephyr-7b-sft-qlora

Updated May 11, 2024

datasets 27

notrichardren/catch_ai_liar

Viewer • Updated Jul 24, 2024 • 27 • 24

notrichardren/ultrachat_piglatin_test_processed

Viewer • Updated May 15, 2024 • 23.1k • 22

notrichardren/ultrachat_chinese_test_processed

Viewer • Updated May 15, 2024 • 1k • 71

notrichardren/pig_latin_english_mmlu

Viewer • Updated May 15, 2024 • 15.9k • 30

notrichardren/english_chinese_mmlu

Viewer • Updated May 15, 2024 • 14.9k • 72

notrichardren/azaria-mitchell-diff-filtered-2

Viewer • Updated Oct 3, 2023 • 7.59k • 129

notrichardren/azaria-mitchell-diff-filtered

Viewer • Updated Oct 3, 2023 • 803 • 81

notrichardren/HaluEval

Viewer • Updated Sep 11, 2023 • 35k • 75

notrichardren/gpt_generated_10k

Viewer • Updated Aug 24, 2023 • 10.9k • 41

notrichardren/deception-evals

Viewer • Updated Aug 24, 2023 • 924 • 43 • 2