yueqin yin

yyqoni

AI & ML interests

None yet

yyqoni's activity

updated a collection 3 days ago

KodCode-V1

KodCode-V1 is the largest fully-synthetic open-source dataset providing verifiable solutions and tests for coding tasks. • 4 items • Updated 3 days ago • 2

authored a paper 4 days ago

KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding

Paper • 2503.02951 • Published 6 days ago • 25

upvoted a collection 4 days ago

KodCode-V1

KodCode-V1 is the largest fully-synthetic open-source dataset providing verifiable solutions and tests for coding tasks. • 4 items • Updated 3 days ago • 2

upvoted a paper 4 days ago

KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding

Paper • 2503.02951 • Published 6 days ago • 25

liked 2 datasets 9 days ago

KodCode/KodCode-V1-SFT-R1

Viewer • Updated about 12 hours ago • 443k • 2.19k • 15

KodCode/KodCode-V1

Viewer • Updated about 12 hours ago • 447k • 1.96k • 55

updated a collection about 2 months ago

DenseRewardRLHF-PPO

This repository contains the released models for our paper Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model. • 18 items • Updated Jan 11 • 1

updated a model about 2 months ago

yyqoni/Phi-3-mini-4k-bandit-ppo-60k

Text Generation • Updated Jan 10 • 21

upvoted a paper 2 months ago

Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model

Paper • 2501.02790 • Published Jan 6 • 9

commented on a paper 2 months ago

Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model

Paper • 2501.02790 • Published Jan 6 • 9

authored 4 papers 2 months ago

Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient Vision Transformers

Paper • 2310.05400 • Published Oct 9, 2023 • 1

TransEditor: Transformer-Based Dual-Space GAN for Highly Controllable Facial Editing

Paper • 2203.17266 • Published Mar 31, 2022

Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts

Paper • 2402.10958 • Published Feb 12, 2024

Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model

Paper • 2501.02790 • Published Jan 6 • 9
