edinlp (edinlp)

simonycl

authored a paper 4 months ago

Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity

Paper • 2510.01171 • Published Oct 1, 2025 • 19

simonycl

authored a paper 5 months ago

GEM: A Gym for Agentic LLMs

Paper • 2510.01051 • Published Oct 1, 2025 • 90

simonycl

authored a paper 8 months ago

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Paper • 2506.24119 • Published Jun 30, 2025 • 50

Chenmien

published a dataset 9 months ago

edinlp/Countdown

Viewer • Updated Jun 4, 2025 • 329k • 7

simonycl

updated a dataset 9 months ago

edinlp/Countdown

Viewer • Updated Jun 4, 2025 • 329k • 7

simonycl

authored a paper 9 months ago

WHEN TO ACT, WHEN TO WAIT: Modeling Structural Trajectories for Intent Triggerability in Task-Oriented Dialogue

Paper • 2506.01881 • Published Jun 2, 2025 • 6

simonycl

authored 2 papers 10 months ago

Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement

Paper • 2409.11378 • Published Sep 17, 2024 • 1

TextArena

Paper • 2504.11442 • Published Apr 15, 2025 • 30

simonycl

updated a model 12 months ago

edinlp/qwen-2.5-base-rlhf-zero-iter6

8B • Updated Feb 22, 2025

simonycl

published a model 12 months ago

edinlp/qwen-2.5-base-rlhf-zero-iter6

8B • Updated Feb 22, 2025

simonycl

updated a model 12 months ago

edinlp/qwen-2.5-base-rlhf-zero-iter5

8B • Updated Feb 22, 2025

simonycl

published a model 12 months ago

edinlp/qwen-2.5-base-rlhf-zero-iter5

8B • Updated Feb 22, 2025

simonycl

updated a model 12 months ago

edinlp/qwen-2.5-base-rlhf-zero-iter4

8B • Updated Feb 22, 2025

simonycl

published a model 12 months ago

edinlp/qwen-2.5-base-rlhf-zero-iter4

8B • Updated Feb 22, 2025

simonycl

updated a model 12 months ago

edinlp/qwen-2.5-base-rlhf-zero-iter3

8B • Updated Feb 22, 2025

simonycl

published a model 12 months ago

edinlp/qwen-2.5-base-rlhf-zero-iter3

8B • Updated Feb 22, 2025

simonycl

updated a model 12 months ago

edinlp/qwen-2.5-base-rlhf-zero-iter2

8B • Updated Feb 22, 2025

simonycl

published 2 models 12 months ago

edinlp/qwen-2.5-base-rlhf-zero-iter2

8B • Updated Feb 22, 2025

edinlp/qwen-2.5-base-rlhf-zero-iter1

Updated Feb 22, 2025

simonycl

updated a model over 1 year ago

edinlp/qwen2-7b-offline-dpo

8B • Updated Nov 16, 2024

Team members 2