yapeichang's Collections
BLEUBERI

Updated 7 days ago

This collection contains datasets and models related to "BLEUBERI: BLEU is a surprisingly effective reward for instruction following".

  • BLEUBERI: BLEU is a surprisingly effective reward for instruction following

    Paper • 2505.11080 • Published 28 days ago • 5

  • yapeichang/BLEUBERI-Tulu3-50k

    Viewer • Updated 4 days ago • 50k • 189 • 1

  • yapeichang/Qwen2.5-7B-BLEUBERI

    Text Generation • Updated 9 days ago • 45 • 1

  • yapeichang/Qwen2.5-7B-RM8B

    Text Generation • Updated 9 days ago • 3

  • yapeichang/Qwen2.5-7B-SFT

    Text Generation • Updated 8 days ago • 148

  • yapeichang/Qwen2.5-3B-BLEUBERI

    Text Generation • Updated 9 days ago • 7

  • yapeichang/Qwen2.5-3B-RM8B

    Text Generation • Updated 8 days ago • 6

  • yapeichang/Qwen2.5-3B-SFT

    Text Generation • Updated 8 days ago • 8

  • yapeichang/Llama-3.1-8B

    Text Generation • Updated 9 days ago • 32

  • yapeichang/Llama-3.1-8B-BLEUBERI

    Text Generation • Updated 9 days ago • 20

  • yapeichang/Llama-3.1-8B-RM8B

    Text Generation • Updated 9 days ago • 6

  • yapeichang/Llama-3.1-8B-SFT

    Text Generation • Updated 9 days ago • 37