Seungone Kim PRO

seungone

https://seungonekim.github.io/

AI & ML interests

Large Language Models, LLM-as-a-Judge, Reward Model Overoptimization, Personalized Alignment

Recent Activity

authored a paper 1 day ago

On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists

upvoted a paper 2 days ago

On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists

submitted a paper 2 days ago

On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists

upvoted a paper 2 days ago

On the limits and opportunities of AI reviewers: Reviewing the reviews of Nature-family papers with 45 expert scientists

Paper • 2605.20668 • Published 3 days ago • 11

upvoted a paper 8 days ago

VibeProteinBench: An Evaluation Benchmark for Language-interfaced Vibe Protein Design

Paper • 2605.10978 • Published 10 days ago • 18

upvoted a paper 11 days ago

Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs

Paper • 2605.09063 • Published 14 days ago • 78

upvoted a paper 2 months ago

Reasoning over mathematical objects: on-policy reward modeling and test time aggregation

Paper • 2603.18886 • Published Mar 19 • 6

upvoted a paper 6 months ago

RefineBench: Evaluating Refinement Capability of Language Models via Checklists

Paper • 2511.22173 • Published Nov 27, 2025 • 15

upvoted a paper 7 months ago

SPICE: Self-Play In Corpus Environments Improves Reasoning

Paper • 2510.24684 • Published Oct 28, 2025 • 18

upvoted a paper 11 months ago

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning

Paper • 2507.00432 • Published Jul 1, 2025 • 79

upvoted 3 papers 12 months ago

Text2Grad: Reinforcement Learning from Natural Language Feedback

Paper • 2505.22338 • Published May 28, 2025 • 8

Datasheets Aren't Enough: DataRubrics for Automated Quality Metrics and Accountability

Paper • 2506.01789 • Published Jun 2, 2025 • 15

Let's Predict Sentence by Sentence

Paper • 2505.22202 • Published May 28, 2025 • 19

upvoted 2 papers about 1 year ago

Reasoning Models Better Express Their Confidence

Paper • 2505.14489 • Published May 20, 2025 • 20

The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think

Paper • 2505.10185 • Published May 15, 2025 • 26

upvoted 7 papers over 1 year ago

Demystifying Long Chain-of-Thought Reasoning in LLMs

Paper • 2502.03373 • Published Feb 5, 2025 • 58

VideoRAG: Retrieval-Augmented Generation over Video Corpus

Paper • 2501.05874 • Published Jan 10, 2025 • 75

LLM-as-an-Interviewer: Beyond Static Testing Through Dynamic LLM Evaluation

Paper • 2412.10424 • Published Dec 10, 2024 • 2

Bridging the Data Provenance Gap Across Text, Speech and Video

Paper • 2412.17847 • Published Dec 19, 2024 • 13

Revisiting In-Context Learning with Long Context Language Models

Paper • 2412.16926 • Published Dec 22, 2024 • 32

MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

Paper • 2412.05237 • Published Dec 6, 2024 • 46

Evaluating Language Models as Synthetic Data Generators

Paper • 2412.03679 • Published Dec 4, 2024 • 47

upvoted an article over 1 year ago

Article

Navigating Korean LLM Research #1: Models

amphora • Oct 22, 2024 • 26