This is a Qwen 3 1.7B model trained on 20k conversations from open-r1/Mixture-of-Thoughts and 3k conversations from mlabonne/FineTome-100k to enhance its reasoning capabilities.

This model is intended to run on weaker or older devices such as smartphones or an old laptop.
You can run this model through any of several inference frameworks.

## Transformers

As the Qwen team suggests, use the `transformers` library:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ertghiu256/qwen3-1.7b-mixture-of-thought"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True  # Switches between thinking and non-thinking modes. Default is True.
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# parsing thinking content
try:
    # rindex finding 151668 (</think>)
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0

thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("thinking content:", thinking_content)
print("content:", content)
```
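The `</think>` parsing above finds the last occurrence of token id 151668 and splits the output there. That logic can be isolated and checked on a plain token list (the helper name `split_thinking` and the token ids around the marker are mine, for illustration only):

```python
def split_thinking(output_ids, think_end_id=151668):
    """Split generated token ids at the last </think> marker.

    Returns (thinking_ids, content_ids). If the marker is absent,
    everything is treated as content, matching the ValueError branch.
    """
    try:
        # list.index on the reversed list finds the *last* occurrence
        index = len(output_ids) - output_ids[::-1].index(think_end_id)
    except ValueError:
        index = 0
    return output_ids[:index], output_ids[index:]

# Marker present: everything up to and including 151668 is "thinking"
thinking, content = split_thinking([101, 102, 151668, 201, 202])
print(thinking)  # [101, 102, 151668]
print(content)   # [201, 202]

# Marker absent: no thinking segment
print(split_thinking([301, 302]))  # ([], [301, 302])
```

Note that the `</think>` token itself lands in the thinking segment, which is why the decode step above uses `skip_special_tokens=True`.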
## vLLM

Run this command:

```shell
vllm serve ertghiu256/qwen3-1.7b-mixture-of-thought --enable-reasoning --reasoning-parser deepseek_r1
```
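Once serving, vLLM exposes an OpenAI-compatible API (on `http://localhost:8000` by default). A minimal sketch of building a chat request with only the standard library — the helper name and the sampling values are my own choices, not part of the card:

```python
import json
import urllib.request

def build_chat_request(prompt, base_url="http://localhost:8000"):
    # Payload shape follows the OpenAI-compatible /v1/chat/completions API
    payload = {
        "model": "ertghiu256/qwen3-1.7b-mixture-of-thought",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.6,
        "top_p": 0.95,
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_chat_request("Give me a short introduction to large language model.")
print(req.full_url)  # http://localhost:8000/v1/chat/completions
# response = urllib.request.urlopen(req)  # uncomment with the server running
```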
## SGLang

Run this command:

```shell
python -m sglang.launch_server --model-path ertghiu256/qwen3-1.7b-mixture-of-thought --reasoning-parser deepseek-r1
```
## llama.cpp

Run this command:

```shell
llama-server --hf-repo ertghiu256/qwen3-1.7b-mixture-of-thought
```

or

```shell
llama-cli --hf-repo ertghiu256/qwen3-1.7b-mixture-of-thought
```
## Ollama

Run this command:

```shell
ollama run hf.co/ertghiu256/qwen3-1.7b-mixture-of-thought:Q4_K_M
```
## LM Studio

Search for `ertghiu256/qwen3-1.7b-mixture-of-thought` in the LM Studio model search list, then download it.
## Recommended inference parameters

Thinking mode:

- temp: 0.6
- num_ctx: ≥8192
- top_p: 0.95
- top_k: 10

Non-thinking mode:

- temp: 0.5
- num_ctx: ≥4096
- top_p: 0.8
- top_k: 10
- min_p: 0.1
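Since `num_ctx`, `top_k`, and `min_p` are Ollama option names, one way to bake the first parameter set above (temp 0.6, num_ctx 8192) into an Ollama model is a custom Modelfile — a sketch; the model name `qwen3-mot` and the choice of the Q4_K_M tag are my own:

```
FROM hf.co/ertghiu256/qwen3-1.7b-mixture-of-thought:Q4_K_M
PARAMETER temperature 0.6
PARAMETER num_ctx 8192
PARAMETER top_p 0.95
PARAMETER top_k 10
```

Build and run it with `ollama create qwen3-mot -f Modelfile`, then `ollama run qwen3-mot`.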
## Training details

- LoRA rank: 32
- Learning rate: 1e-4
- Steps: 70
- Datasets:
  - FlameF0X/Mixture-of-Thoughts-2048T
  - mlabonne/FineTome-100k
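In PEFT terms, this corresponds to a rank-32 LoRA adapter trained for 70 steps at a 1e-4 learning rate. A minimal sketch of how those numbers would plug into a `LoraConfig` and `TrainingArguments` — everything not listed in the card (LoRA alpha, target modules, batch size, output directory) is an assumption here:

```python
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=32,                   # LoRA rank, from the card
    lora_alpha=64,          # assumption: alpha is not stated in the card
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="qwen3-1.7b-mot",  # assumption
    learning_rate=1e-4,           # from the card
    max_steps=70,                 # from the card
)
```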