Axolotl config (axolotl version `0.10.0.dev0`):

```yaml
base_model: Qwen/Qwen3-14B-Base
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
liger_rope: true
liger_rms_norm: true
liger_glu_activation: true
chat_template_jinja: "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0].role == 'system' %}\n {{- messages[0].content + '\\n\\n' }}\n {%- endif %}\n {{- \"# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0].role == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0].content + '<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}\n{%- for message in messages[::-1] %}\n {%- set index = (messages|length - 1) - loop.index0 %}\n {%- if ns.multi_step_tool and message.role == \"user\" and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}\n {%- set ns.multi_step_tool = false %}\n {%- set ns.last_query_index = index %}\n {%- endif %}\n{%- endfor %}\n{%- for message in messages %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) %}\n {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {%- set content = message.content %}\n {%- set reasoning_content = '' %}\n {{- '<|im_start|>' + message.role + '\\n' + content }}\n {%- if message.tool_calls %}\n {%- for tool_call in message.tool_calls %}\n {%- if (loop.first and content) or (not loop.first) %}\n {{- '\\n' }}\n {%- endif %}\n {%- if tool_call.function %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {%- if tool_call.arguments is string %}\n {{- tool_call.arguments }}\n {%- else %}\n {{- tool_call.arguments | tojson }}\n {%- endif %}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {%- endif %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if loop.first or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- message.content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n{%- endif %}"
datasets:
  - path: winglian/cuda-engineer-augment-v4-filtered
    type: chat_template
    split: train
    # split_thinking: true
    eot_tokens: ["<|im_end|>"]
  - path: axolotl-ai-internal/gpumode-py2triton-reasoning-v2-filtered
    type: chat_template
    split: train
    # split_thinking: true
    eot_tokens: ["<|im_end|>"]
dataset_prepared_path: last_run_prepared
val_set_size: 0.005
output_dir: ./outputs/out
save_only_model: true
sequence_len: 16384
sample_packing: true
pad_to_sequence_len: true
wandb_project: qwen3-14b-grpo-triton
wandb_entity: axolotl-ai
wandb_watch:
wandb_name:
wandb_log_model:
gradient_accumulation_steps: 1
micro_batch_size: 2
num_epochs: 3
optimizer: adamw_torch_fused
max_grad_norm: 0.1
neftune_noise_alpha: 10
lr_scheduler: cosine
learning_rate: 1e-5
bf16: true
tf32: true
gradient_checkpointing: offload
gradient_checkpointing_kwargs:
  use_reentrant: false
logging_steps: 1
flash_attention: true
warmup_steps: 100
evals_per_epoch: 5
saves_per_epoch: 1
weight_decay: 0.01
deepspeed: deepspeed_configs/zero1.json
```
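This config can be launched with the axolotl CLI, e.g. `axolotl train config.yaml`. The custom `chat_template_jinja` above mirrors Qwen3's ChatML-style tool-calling format. Below is a minimal sketch of rendering a tool-calling prompt with it, assuming the exported tokenizer ships with this template; the `get_gpu_name` tool schema is hypothetical and used only for illustration.

```python
from transformers import AutoTokenizer

# Assumes the exported tokenizer carries the chat template from the config above.
tokenizer = AutoTokenizer.from_pretrained("winglian/qwen3-14b-triton-sft-v3")

# Hypothetical tool schema, used only to exercise the <tools> branch of the template.
tools = [{
    "type": "function",
    "function": {
        "name": "get_gpu_name",
        "description": "Return the name of the active CUDA device.",
        "parameters": {"type": "object", "properties": {}},
    },
}]

messages = [
    {"role": "system", "content": "You are a CUDA/Triton engineering assistant."},
    {"role": "user", "content": "Which GPU am I running on?"},
]

# Renders the <|im_start|>/<|im_end|> turns plus the <tools> block, ending with
# '<|im_start|>assistant\n' because add_generation_prompt=True.
prompt = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, tokenize=False
)
print(prompt)
```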
# outputs/out
This model is a fine-tuned version of Qwen/Qwen3-14B-Base on the winglian/cuda-engineer-augment-v4-filtered and the axolotl-ai-internal/gpumode-py2triton-reasoning-v2-filtered datasets. It achieves the following results on the evaluation set:
- Loss: 0.2262
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data

Training used the two datasets listed in the config above; evaluation used a held-out 0.5% split of the training data (`val_set_size: 0.005`).
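Both corpora are hosted on the Hugging Face Hub and can be inspected with 🤗 Datasets, as in this sketch (the `axolotl-ai-internal` repo may be gated, so this assumes you have access):

```python
from datasets import load_dataset

# Train splits referenced in the config above.
cuda_ds = load_dataset("winglian/cuda-engineer-augment-v4-filtered", split="train")
triton_ds = load_dataset("axolotl-ai-internal/gpumode-py2triton-reasoning-v2-filtered", split="train")
print(len(cuda_ds), len(triton_ds))
```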
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 16 (see the sketch after this list)
- total_eval_batch_size: 16
- optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- num_epochs: 3.0
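The total train batch size of 16 is not set directly; it follows from the per-device settings in the config, as this arithmetic sketch shows:

```python
# Effective batch size implied by the axolotl config above.
micro_batch_size = 2                 # per-device train batch size
gradient_accumulation_steps = 1
num_devices = 8                      # multi-GPU (DeepSpeed ZeRO-1)

total_train_batch_size = micro_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)        # 16
```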
### Training results
| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.4626        | 0.0056 | 1    | 0.4989          |
| 0.3018        | 0.2    | 36   | 0.3577          |
| 0.2528        | 0.4    | 72   | 0.2954          |
| 0.2273        | 0.6    | 108  | 0.2686          |
| 0.2238        | 0.8    | 144  | 0.2540          |
| 0.2143        | 1.0    | 180  | 0.2458          |
| 0.1964        | 1.2    | 216  | 0.2387          |
| 0.1913        | 1.4    | 252  | 0.2357          |
| 0.1809        | 1.6    | 288  | 0.2327          |
| 0.1814        | 1.8    | 324  | 0.2296          |
| 0.1769        | 2.0    | 360  | 0.2271          |
| 0.1638        | 2.2    | 396  | 0.2253          |
| 0.1594        | 2.4    | 432  | 0.2257          |
| 0.154         | 2.6    | 468  | 0.2262          |
| 0.1578        | 2.8    | 504  | 0.2262          |
| 0.1571        | 3.0    | 540  | 0.2262          |
### Framework versions
- Transformers 4.51.3
- Pytorch 2.6.0+cu124
- Datasets 3.5.1
- Tokenizers 0.21.1
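For inference, here is a minimal generation sketch using the library versions above (it assumes a CUDA device with enough memory for the 14B model in bf16; the repo id is this card's):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "winglian/qwen3-14b-triton-sft-v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Write a Triton kernel that adds two float32 vectors."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Decode only the newly generated tokens, skipping the prompt.
output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```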