Qwen3-14B-speculator.eagle3

Model Overview

  • Verifier: Qwen/Qwen3-14B
  • Speculative Decoding Algorithm: EAGLE-3
  • Model Architecture: Eagle3Speculator
  • Model Size: 1.39B params (BF16)
  • Release Date: 09/18/2025
  • Version: 1.0
  • Model Developers: Red Hat

This is a speculator model designed for use with Qwen/Qwen3-14B, based on the EAGLE-3 speculative decoding algorithm. It was trained with the speculators library on a combination of the Aeala/ShareGPT_Vicuna_unfiltered dataset and the train_sft split of HuggingFaceH4/ultrachat_200k. Inputs must use the Qwen/Qwen3-14B chat template, so the model should be served through the /chat/completions endpoint.

Use with vLLM

vllm serve Qwen/Qwen3-14B \
  -tp 2 \
  --speculative-config '{
    "model": "RedHatAI/Qwen3-14B-speculator.eagle3",
    "num_speculative_tokens": 3,
    "method": "eagle3"
  }'
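Once the server is up, requests go through the OpenAI-compatible /chat/completions endpoint so that the Qwen3 chat template is applied. A minimal sketch of building such a request with only the standard library; the base URL assumes vLLM's default port (8000), and the prompt is illustrative:

```python
import json
import urllib.request

# Payload for vLLM's OpenAI-compatible chat completions endpoint.
# The "model" field names the verifier; the speculator is applied
# server-side via the --speculative-config passed at startup.
payload = {
    "model": "Qwen/Qwen3-14B",
    "messages": [{"role": "user", "content": "What is 12 * 7?"}],
    "max_tokens": 64,
}

req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",  # assumed default host/port
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

# With a running server, send the request and read the reply:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```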

Evaluations

For each benchmark, acceptance_rates[i] is the percentage of verification steps in which the i-th drafted token was accepted, and conditional_acceptance_rates[i] is the same quantity conditioned on the first i-1 drafted tokens having been accepted.

Subset of GSM8k (math reasoning):

  • acceptance_rates = [71.7, 47.8, 29.1]
  • conditional_acceptance_rates = [71.7, 66.6, 60.9]

Subset of MTBench:

  • acceptance_rates = [64.3, 38.2, 21.0]
  • conditional_acceptance_rates = [64.3, 59.4, 54.9]
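The two lists are consistent with each other: each acceptance_rates entry is the running product of the conditional rates, and summing the unconditional rates estimates the mean number of accepted draft tokens per verification step. A small sketch (the helper name is illustrative) checking this against the GSM8k numbers above:

```python
def accepted_stats(conditional_rates):
    """From per-position conditional acceptance rates (%), derive the
    cumulative (unconditional) acceptance rates and the expected number
    of accepted draft tokens per verification step."""
    cum, p = [], 1.0
    for r in conditional_rates:
        p *= r / 100.0   # token i is accepted only if tokens 0..i-1 were
        cum.append(p)
    return [round(100 * c, 1) for c in cum], sum(cum)

cum, expected = accepted_stats([71.7, 66.6, 60.9])
print(cum)       # matches the GSM8k acceptance_rates: [71.7, 47.8, 29.1]
print(expected)  # ~1.49 accepted draft tokens per step, on average
```

Each verification step also emits one token from the verifier itself, so on GSM8k a step produces roughly 2.5 tokens on average.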