rm-robustness

Activity Feed

JW17

authored 2 papers about 1 month ago

AlphaPO -- Reward shape matters for LLM alignment

Paper • 2501.03884 • Published Jan 7 • 1

Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning

Paper • 2504.03380 • Published Apr 4

JW17

authored a paper 2 months ago

When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research

Paper • 2505.11855 • Published May 17 • 10

amphora

authored a paper 2 months ago

When AI Co-Scientists Fail: SPOT-a Benchmark for Automated Verification of Scientific Research

Paper • 2505.11855 • Published May 17 • 10

JW17

updated a collection 3 months ago

[ICML 2025] Robustness in RMs

Collection

Dataset and reward models for "On the Robustness of Reward Models for Language Model Alignment (ICML 2025)" • 8 items • Updated May 27

JW17

updated a model 3 months ago

rm-robustness/L31-8B-SKPv2-BSR-1e4

Text Classification • 8B • Updated May 11 • 8

JW17

published a model 3 months ago

rm-robustness/L31-8B-SKPv2-BSR-1e4

Text Classification • 8B • Updated May 11 • 8

JW17

updated a model 3 months ago

rm-robustness/L31-8B-SKPv2-BSR-1e3

Text Classification • 8B • Updated May 11 • 3

JW17

published a model 3 months ago

rm-robustness/L31-8B-SKPv2-BSR-1e3

Text Classification • 8B • Updated May 11 • 3

JW17

updated a model 3 months ago

rm-robustness/L31-8B-SKPv2-BSR-1e2

Text Classification • 8B • Updated May 11 • 3

JW17

published a model 3 months ago

rm-robustness/L31-8B-SKPv2-BSR-1e2

Text Classification • 8B • Updated May 11 • 3

JW17

updated a dataset 3 months ago

rm-robustness/ultrafeedback-valid-4-mutual-ood

Viewer • Updated May 11 • 11.1k • 9

JW17

published a dataset 3 months ago

rm-robustness/ultrafeedback-valid-4-mutual-ood

Viewer • Updated May 11 • 11.1k • 9

Team members 4
