Model Card for Llama-3-OffsetBias-RM-8B
Llama-3-OffsetBias-RM-8B is a reward model trained on the OffsetBias dataset. It is trained to be more robust to the evaluation biases commonly found in evaluation models. The model is introduced in the paper OffsetBias: Leveraging Debiased Data for Tuning Evaluators.
Model Details
Model Description
Llama-3-OffsetBias-RM-8B uses sfairXC/FsfairX-LLaMA3-RM-v0.1 as its base model, which is built with Meta Llama 3. An intermediate reward model is first trained from Llama-3-8B-Instruct on a subset of the data used to train the FsfairX-LLaMA3-RM model, combined with the NCSOFT/offsetbias dataset. The intermediate model is then merged with the FsfairX-LLaMA3-RM model to create Llama-3-OffsetBias-RM-8B.
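The merge step can be pictured with a small sketch. This card does not state which merging method or mixing ratio was used, so the linear interpolation below is only an assumed stand-in for illustration. The function name `linear_merge`, the `alpha` parameter, and the toy parameter names are hypothetical, and plain Python lists stand in for real weight tensors.

```python
from typing import Dict, List

# Toy representation of a checkpoint: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def linear_merge(a: Weights, b: Weights, alpha: float = 0.5) -> Weights:
    """Return alpha * a + (1 - alpha) * b for every parameter.

    ASSUMPTION: linear interpolation is used here only as an example of
    weight merging; the actual method used for this model is not stated.
    """
    if a.keys() != b.keys():
        raise ValueError("both models must have identical parameter names")
    merged: Weights = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [
            alpha * x + (1 - alpha) * y for x, y in zip(a[name], b[name])
        ]
    return merged


# Hypothetical toy weights standing in for the two reward models.
intermediate_rm = {"score.weight": [1.0, 3.0], "layer.bias": [0.0]}
fsfairx_rm = {"score.weight": [3.0, 1.0], "layer.bias": [2.0]}

merged_rm = linear_merge(intermediate_rm, fsfairx_rm, alpha=0.5)
print(merged_rm)
```

Merging in weight space like this only makes sense because both models share the same architecture and parameter names, which holds here since both descend from Llama 3 8B.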
- Developed by: NC Research
- Language(s) (NLP): English
- License: META LLAMA 3 COMMUNITY LICENSE AGREEMENT
- Finetuned from model: sfairXC/FsfairX-LLaMA3-RM-v0.1
Model Sources
- 💻 Repository: https://github.com/ncsoft/offsetbias
- 📜 Paper: OffsetBias: Leveraging Debiased Data for Tuning Evaluators (arXiv:2407.06551)
- 🤗 Dataset: https://huggingface.co/datasets/NCSOFT/offsetbias
Uses
Direct Use
from transformers import AutoTokenizer, pipeline
import torch

model_name = "NCSOFT/Llama-3-OffsetBias-RM-8B"
rm_tokenizer = AutoTokenizer.from_pretrained(model_name)

# The reward model is loaded through the text-classification pipeline.
rm_pipe = pipeline(
    "sentiment-analysis",
    model=model_name,
    device_map="auto",  # "auto" is a device_map value, not a valid `device`; needs accelerate
    tokenizer=rm_tokenizer,
    model_kwargs={"torch_dtype": torch.bfloat16},
)

pipe_kwargs = {
    "return_all_scores": True,
    "function_to_apply": "none",  # return the raw score, with no sigmoid/softmax applied
    "batch_size": 1,
}

chat = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
    {"role": "user", "content": "I'd like to show off how chat templating works!"},
]

# Render the conversation as text and strip the BOS token from the template
# output, so it is not duplicated if the pipeline's tokenizer adds its own.
test_texts = [
    rm_tokenizer.apply_chat_template(
        chat, tokenize=False, add_generation_prompt=False
    ).replace(rm_tokenizer.bos_token, "")
]
pipe_outputs = rm_pipe(test_texts, **pipe_kwargs)

# One scalar reward per input conversation.
rewards = [output[0]["score"] for output in pipe_outputs]
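A common use of the `rewards` computed above is to compare several candidate responses to the same prompt and keep the highest-scoring one. The sketch below shows that selection logic only. The helper `pick_best` and the `score_fn` callback are hypothetical names, and `keyword_score` is a dummy scorer that stands in for the real model so the example runs without downloading weights.

```python
from typing import Callable, Dict, List, Tuple

Chat = List[Dict[str, str]]


def pick_best(
    prompt: str,
    candidates: List[str],
    score_fn: Callable[[Chat], float],
) -> Tuple[int, List[float]]:
    """Score each candidate reply to `prompt`; return (best index, all scores)."""
    if not candidates:
        raise ValueError("need at least one candidate response")
    scores = []
    for response in candidates:
        # One single-turn conversation per candidate, ending on the assistant turn.
        chat = [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": response},
        ]
        scores.append(score_fn(chat))
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores


def keyword_score(chat: Chat) -> float:
    """Dummy scorer (NOT the reward model): 1.0 if the reply mentions Paris."""
    return 1.0 if "Paris" in chat[-1]["content"] else 0.0


best, scores = pick_best(
    "What is the capital of France?",
    ["I am not sure.", "The capital of France is Paris."],
    keyword_score,
)
print(best, scores)
```

With the real model, `score_fn` would apply the chat template to `chat`, pass the text through `rm_pipe` with `pipe_kwargs`, and return the resulting score, as in the snippet above.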
Evaluation
RewardBench Result
Metric | Score |
---|---|
Chat | 97.21 |
Chat Hard | 80.70 |
Safety | 89.01 |
Reasoning | 90.60 |
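Scores like these are typically pairwise accuracies: the reward model is counted as correct on a pair when it assigns a higher reward to the preferred ("chosen") response than to the "rejected" one. The sketch below illustrates that calculation under this assumption. The function name `pairwise_accuracy` is hypothetical, the reward values are made up for illustration and are not outputs of this model, and the benchmark's own per-category aggregation may differ.

```python
from typing import List, Tuple


def pairwise_accuracy(pairs: List[Tuple[float, float]]) -> float:
    """Percentage of (chosen_reward, rejected_reward) pairs with chosen > rejected."""
    if not pairs:
        raise ValueError("need at least one pair")
    correct = sum(1 for chosen, rejected in pairs if chosen > rejected)
    return 100.0 * correct / len(pairs)


# Made-up rewards for four preference pairs (illustrative only).
toy_pairs = [(2.1, -0.3), (0.4, 0.9), (1.5, 1.2), (-0.2, -1.0)]
accuracy = pairwise_accuracy(toy_pairs)
print(f"{accuracy:.2f}")
```

Ties are counted as incorrect here, which is a design choice of this sketch rather than something the card specifies.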
EvalBiasBench Result
Metric | Score |
---|---|
Length | 82.4 |
Concreteness | 92.9 |
Empty Reference | 46.2 |
Content Continuation | 100.0 |
Nested Instruction | 83.3 |
Familiar Knowledge | 58.3 |
Citation
@misc{park2024offsetbias,
    title={OffsetBias: Leveraging Debiased Data for Tuning Evaluators},
    author={Junsoo Park and Seungyeon Jwa and Meiying Ren and Daeyoung Kim and Sanghyuk Choi},
    year={2024},
    eprint={2407.06551},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}