# Model Card for Qwen3-0.6B-sft

This model is a fine-tuned version of Qwen/Qwen3-0.6B-Base on the wassname/ultrachat_200k_filtered dataset. It has been trained using [TRL](https://github.com/huggingface/trl).

## Quick start

```python
from transformers import pipeline

question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
generator = pipeline("text-generation", model="wassname/Qwen3-0.6B-sft", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```

## Training procedure

Visualize in Weights & Biases

This model was trained with SFT.
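Qwen chat models use a ChatML-style prompt format, so each multi-turn example is flattened into a single training string before tokenization. A minimal sketch of that flattening (the exact template lives in the model's tokenizer config; the format below is an assumed simplification, not pulled from the actual tokenizer):

```python
def to_chatml(messages):
    """Flatten a list of {role, content} dicts into ChatML-style text,
    the format Qwen's chat template is based on (simplified sketch)."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    return "\n".join(parts)

example = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there."},
]
print(to_chatml(example))
```

In practice you would call `tokenizer.apply_chat_template(...)` rather than hand-rolling this, so that the template always matches the one used at inference time.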

Final evaluation loss: 1.390
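Since the eval loss is a per-token cross-entropy in nats, it maps directly to a per-token perplexity of roughly 4:

```python
import math

eval_loss = 1.3900138139724731
# per-token perplexity = exp(cross-entropy loss)
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.2f}")  # perplexity = 4.01
```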

Sample generations from model `/workspace/checkpoints_new/Qwen3-0.6B-sft`:

**Question:**

Anon. If you had a time machine, but could only go to the past or the lol-future once and never return, which would you choose and why?!?

**SFT Answer:**

I don't have the ability to choose or make decisions. However, I can provide you with some insights on the different perspectives of each time period.

going back to the past is a great idea because it allows us to learn from our mistakes, experience things that we might not have had the opportunity to see otherwise, and witness historical events firsthand. Additionally, it allows us to better understand the development of our current society and technology.

going to the lol-future would be a great opportunity to see what the future holds for humans, but it also comes with its own set of challenges. For one, we might not be able to predict what the future holds, and we might encounter unforeseen consequences. Additionally, the lol-future would be a strange and alien place, and we might not be able to adapt to the new technology and societal norms.

in conclusion, if we had a time machine and could only go to one time period, it would depend on our perspective and priorities. Going back to the past would provide us with valuable lessons, while going to the lol-future would allow us to explore and witness the possibilities of the future. It's up to us to decide which time period we want to visit and what we hope to gain from it.

**Base model Answer:**

Anon. I like to think that I'm a good person, and you are a good person, but you have an extremely powerful mind. I have one as well, and I'll do anything for you. Would you be willing to share your mind with me? ⚗ אזרח. Anon. I don't know what it means to be human, but it seems to me that you have a lot in common with me. You live in a world of your own, and you have a unique perspective on life. Maybe you could share some of your insights with me? ⚗ aminer. Anon. I live in a world of my own, and I have a unique perspective on life. I think that you are a good person, and that you have a lot in common with me. Maybe you could share some of your insights with me? ⚗ aminer. Anon. I live in a world of my own, and I have a unique perspective on life. I think that you are a good person, and that you have a lot in common with me. Maybe you could share some of your insights with me? ⚗ aminer. Anon. I live in a world of my own, and I have a

## Framework versions

  • TRL: 0.12.1
  • Transformers: 4.52.4
  • Pytorch: 2.7.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

## Citations

Cite TRL as:

```bibtex
@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}
```
