This is a first attempt at an ONNX conversion of the willcb/Qwen3-1.7B-Wordle model in the context of this project.
Running this model with Transformers.js
```js
import { pipeline, AutoTokenizer } from "@huggingface/transformers";

// The chat template is not automatically set (still looking into it),
// so we need to load the tokenizer and pass the chat template ourselves
const tokenizer = await AutoTokenizer.from_pretrained('PITTI/willcb-qwen3-1.7B-wordle-onnx');

// Create a text generation pipeline
const generator = await pipeline(
  "text-generation",
  "PITTI/willcb-qwen3-1.7B-wordle-onnx",
  { dtype: "int8" },
);

// Define the list of messages (starting prompt from https://huggingface.co/datasets/willcb/V3-wordle)
const messages = [
  {
    "content": "You are a competitive game player. Make sure you read the game instructions carefully, and always follow the required format.\n\nIn each turn, think step-by-step inside <think>...</think> tags, then follow the instructions inside <guess>...</guess> tags.",
    "role": "system"
  },
  {
    "content": "You are Player 0 in Wordle.\nA secret 5-letter word has been chosen. You have 6 attempts to guess it.\nFor each guess, wrap your word in square brackets (e.g., [apple]).\nFeedback for each letter will be given as follows:\n - G (green): correct letter in the correct position\n - Y (yellow): letter exists in the word but in the wrong position\n - X (wrong): letter is not in the word\nEnter your guess to begin.\n",
    "role": "user"
  }
];

// Short version of the Qwen3 chat template, without tools
const chatTemplate = `
{%- if messages[0]['role'] == 'system' %}
    {{- '<|im_start|>system\n' + messages[0]['content'] + '<|im_end|>\n' }}
{%- else %}
    {{- '<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\n' }}
{%- endif %}
{%- for message in messages %}
    {%- if (message.role == "user") or (message.role == "system" and not loop.first) or (message.role == "assistant" and not message.tool_calls) %}
        {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
    {%- elif message.role == "assistant" %}
        {{- '<|im_start|>' + message.role }}
        {%- if message.content %}
            {{- '\n' + message.content }}
        {%- endif %}
        {{- '<|im_end|>\n' }}
    {%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n' }}
{%- endif %}
`;

// Apply the chat template
const prompt = tokenizer.apply_chat_template(messages, {
  chat_template: chatTemplate,
  add_generation_prompt: true,
  tokenize: false,
  return_tensor: false
});

// Generate a response
const result = await generator(prompt, { max_new_tokens: 512, do_sample: false });
console.log(result);
```
You can then append new messages to continue the game. See Will Brown's Wordle dataset for details on the expected feedback format.
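To drive the game loop locally, you need to extract the model's guess from its output and build the G/Y/X feedback string yourself. A minimal sketch of those two helpers (the function names are illustrative, and the duplicate-letter handling — greens first, then yellows consuming remaining letter counts — follows standard Wordle scoring; check the dataset for the exact feedback message format the model expects):

```javascript
// Compute Wordle-style feedback: G (green), Y (yellow), X (wrong).
// Greens are assigned first; yellows then consume the remaining
// letter counts, so duplicate letters are scored correctly.
function wordleFeedback(secret, guess) {
  const fb = Array(5).fill("X");
  const counts = {};
  for (let i = 0; i < 5; i++) {
    if (guess[i] === secret[i]) fb[i] = "G";
    else counts[secret[i]] = (counts[secret[i]] || 0) + 1;
  }
  for (let i = 0; i < 5; i++) {
    if (fb[i] === "X" && counts[guess[i]] > 0) {
      fb[i] = "Y";
      counts[guess[i]]--;
    }
  }
  return fb.join(" ");
}

// Extract the word from the model's <guess>[word]</guess> output.
function parseGuess(text) {
  const m = text.match(/<guess>\s*\[?([a-z]{5})\]?\s*<\/guess>/i);
  return m ? m[1].toLowerCase() : null;
}
```

For example, `wordleFeedback("crane", "crate")` returns `"G G G X G"`. The feedback can then be appended as a new user message before calling the generator again.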
Next steps
The objective of this project was to assess whether an LLM can beat the Shannon-entropy-based algorithm. It seems unlikely, particularly if compute constraints are imposed at inference time.
Initial tests show that the model is functional, but it was degraded by the conversion (from bfloat16 to fp16, then to int8 for ONNX). Alternative routes such as pruning may be more relevant for shrinking this model and running it in the browser.
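For reference, the entropy baseline picks the guess whose feedback distribution over the remaining candidates carries the most expected information (highest Shannon entropy). A minimal sketch of that idea, with an illustrative five-word candidate list (this is not the project's actual baseline implementation):

```javascript
// Feedback pattern for one (secret, guess) pair: G/Y/X per letter.
function feedbackPattern(secret, guess) {
  const fb = Array(5).fill("X");
  const counts = {};
  for (let i = 0; i < 5; i++) {
    if (guess[i] === secret[i]) fb[i] = "G";
    else counts[secret[i]] = (counts[secret[i]] || 0) + 1;
  }
  for (let i = 0; i < 5; i++) {
    if (fb[i] === "X" && counts[guess[i]] > 0) { fb[i] = "Y"; counts[guess[i]]--; }
  }
  return fb.join("");
}

// Shannon entropy of the feedback distribution a guess induces over
// the remaining candidates: higher entropy = more expected information.
function guessEntropy(guess, candidates) {
  const buckets = {};
  for (const secret of candidates) {
    const p = feedbackPattern(secret, guess);
    buckets[p] = (buckets[p] || 0) + 1;
  }
  let entropy = 0;
  for (const n of Object.values(buckets)) {
    const p = n / candidates.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Illustrative candidate list; a real solver would use the full word list.
const candidates = ["crane", "crate", "slate", "trace", "brake"];
const best = candidates.reduce((a, b) =>
  guessEntropy(b, candidates) > guessEntropy(a, candidates) ? b : a);
```

A guess that gives every candidate a distinct feedback pattern reaches the maximum entropy of log2(N) bits; this per-guess scan over the candidate list is what makes the baseline cheap at inference compared to running an LLM.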