Model Card for Seed-Coder-8B-Instruct-KTO

This model is a fine-tuned version of Seed-Coder-8B-Instruct for price prediction in Thailand, as requested by GDX. It was trained using TRL. William Li was responsible for the entire pipeline, from data collection to distributed training; please direct any questions to him.

Quick start

from transformers import pipeline

question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
# device="cuda" assumes a GPU is available; use device="cpu" otherwise (much slower for an 8B model).
generator = pipeline("text-generation", model="willyli/Seed-Coder-8B-Instruct-KTO", device="cuda")
# Pass a chat-style message list; return_full_text=False returns only the newly generated reply.
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])

Training procedure

This model was trained with KTO, a method introduced in KTO: Model Alignment as Prospect Theoretic Optimization.
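
To make the objective more concrete, here is a minimal per-example sketch of the KTO loss as described in the paper cited below. It is not the TRL implementation and does not reflect this model's actual hyperparameters: the function name `kto_loss`, the default values of `beta`, `lambda_d`, and `lambda_u`, and the log-probabilities in the demo are all illustrative assumptions.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def kto_loss(policy_logp: float, ref_logp: float, desirable: bool,
             kl_ref: float = 0.0, beta: float = 0.1,
             lambda_d: float = 1.0, lambda_u: float = 1.0) -> float:
    """Per-example KTO loss (illustrative sketch, not TRL's code).

    policy_logp / ref_logp: log-probability of the completion under the
    policy and the frozen reference model.
    kl_ref: the reference point z0 (a batch-level KL estimate in the paper);
    here it is simply passed in as a number.
    """
    # Implied reward: log-ratio between policy and reference model.
    reward = policy_logp - ref_logp
    if desirable:
        # Desirable completions gain value as the reward rises above z0.
        value = lambda_d * sigmoid(beta * (reward - kl_ref))
        return lambda_d - value
    # Undesirable completions gain value as the reward falls below z0.
    value = lambda_u * sigmoid(beta * (kl_ref - reward))
    return lambda_u - value


# Hypothetical log-probabilities, for illustration only.
print(kto_loss(policy_logp=-4.0, ref_logp=-5.0, desirable=True))
print(kto_loss(policy_logp=-4.0, ref_logp=-5.0, desirable=False))
```

Unlike pairwise preference methods, each example carries a single desirable or undesirable label, so the loss is computed per completion rather than per chosen/rejected pair.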

Framework versions

  • TRL: 0.18.1
  • Transformers: 4.52.4
  • PyTorch: 2.7.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1
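
When reproducing results, it can help to compare the local environment against the versions listed above. The following is a small sketch using only the standard library; the helper name `compare_versions` is hypothetical, and the PyPI distribution names (for example `torch` for PyTorch) are assumptions about how the packages were installed.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this model card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "trl": "0.18.1",
    "transformers": "4.52.4",
    "torch": "2.7.0",
    "datasets": "3.6.0",
    "tokenizers": "0.21.1",
}


def compare_versions(expected, lookup=version):
    """Return {name: (wanted, found, matches)} for each expected package."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = lookup(name)
        except PackageNotFoundError:
            found = None  # package is not installed
        # Ignore local build suffixes such as "+cu126" when comparing.
        matches = found is not None and found.split("+")[0] == wanted
        report[name] = (wanted, found, matches)
    return report


for name, (wanted, found, ok) in compare_versions(EXPECTED).items():
    status = "ok" if ok else "differs"
    print(f"{name}: expected {wanted}, found {found} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; it only flags where the environment differs from the one used for training.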

Citations

Cite KTO as:

@article{ethayarajh2024kto,
    title        = {{KTO: Model Alignment as Prospect Theoretic Optimization}},
    author       = {Kawin Ethayarajh and Winnie Xu and Niklas Muennighoff and Dan Jurafsky and Douwe Kiela},
    year         = 2024,
    eprint       = {arXiv:2402.01306},
}

Cite TRL as:

@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}