TokenHD-4B

TokenHD-4B is a token-level hallucination detector fine-tuned from Qwen/Qwen3-4B with the TokenHD pipeline. It assigns a hallucination probability to each token of an LLM-generated response, which localizes errors at token granularity without requiring predefined step segmentation.

Paper: Scalable Token-Level Hallucination Detection in Large Language Models
Code: github.com/rmin2000/TokenHD


Model Details

Property          Value
Base model        Qwen/Qwen3-4B
Architecture      AutoModelForTokenClassification (num_labels=1)
Training domain   Mathematics (competition-level problems)
Output            Per-token hallucination probability (sigmoid of logits)

Usage

from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

model_id = "mr233/TokenHD-4B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)
model.eval()

text = "The capital of France is London."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, 1)
scores = torch.sigmoid(logits).squeeze(-1).squeeze(0)  # per-token hallucination probability

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for tok, score in zip(tokens, scores.tolist()):
    print(f"{tok:20s} {score:.3f}")
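The per-token scores can be turned into flagged spans by thresholding and merging adjacent flagged tokens. The sketch below is one way to do that and is not part of the TokenHD code: the helper name `flag_spans`, the 0.5 threshold, and the example scores are all assumptions made for illustration, and the token strings are simplified.

```python
from typing import List, Sequence, Tuple


def flag_spans(
    tokens: Sequence[str],
    scores: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not a value from the model card
) -> List[Tuple[int, int, List[str]]]:
    """Merge consecutive tokens with score >= threshold into (start, end, tokens) spans.

    `end` is exclusive, so tokens[start:end] are the flagged tokens.
    """
    spans = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i  # open a new span
        elif start is not None:
            spans.append((start, i, list(tokens[start:i])))  # close the span
            start = None
    if start is not None:  # span runs to the end of the sequence
        spans.append((start, len(scores), list(tokens[start:])))
    return spans


# Hypothetical scores, made up for illustration (not real model output).
example_tokens = ["The", "capital", "of", "France", "is", "London", "."]
example_scores = [0.02, 0.03, 0.01, 0.05, 0.10, 0.93, 0.40]
print(flag_spans(example_tokens, example_scores))
```

With the model loaded as above, `tokens` and `scores.tolist()` can be passed in directly. The threshold trades off missed errors against false alarms, so it is worth tuning on held-out data.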

Evaluation

TokenHD models are evaluated with two metrics:

  • S_incor: Token-level F1 on hallucinated (incorrect) responses, which measures how precisely the detector localizes errors.
  • S_cor: Recall on hallucination-free (correct) responses, which measures how rarely the detector raises false alarms.
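To make the two metrics concrete, here is a minimal sketch of how they could be computed for a single response. It is one plausible reading, not the paper's evaluation code: the function names, the binary per-token labels, and the interpretation of S_cor as the fraction of tokens left unflagged in a correct response are assumptions, and how scores are thresholded and aggregated across responses is not specified here.

```python
from typing import Sequence


def token_f1(pred: Sequence[bool], gold: Sequence[bool]) -> float:
    """Token-level F1 for one incorrect response.

    pred[i] is True if the detector flags token i;
    gold[i] is True if token i is annotated as hallucinated.
    """
    tp = sum(1 for p, g in zip(pred, gold) if p and g)
    fp = sum(1 for p, g in zip(pred, gold) if p and not g)
    fn = sum(1 for p, g in zip(pred, gold) if not p and g)
    if tp == 0:
        return 0.0  # no overlap between flagged and hallucinated tokens
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def unflagged_fraction(pred: Sequence[bool]) -> float:
    """Assumed reading of S_cor for one correct response:
    the share of tokens the detector does NOT flag."""
    if not pred:
        return 1.0  # an empty response has no false alarms
    return sum(1 for p in pred if not p) / len(pred)


# Toy example: one flagged token is right, one is a false alarm.
pred = [False, True, True, False]
gold = [False, True, False, False]
print(f"{token_f1(pred, gold):.3f}")
print(f"{unflagged_fraction([False, False, True, False]):.2f}")
```

Here `pred` would come from thresholding the sigmoid scores, for example `[s >= 0.5 for s in scores.tolist()]`, where 0.5 is again an assumed cut-off.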

Citation

@article{tokenhd2025,
  title={Scalable Token-Level Hallucination Detection in Large Language Models},
  author={Min, Rui and Pang, Tianyu and Du, Chao and Cheng, Minhao and Fung, Yi R.},
  year={2025}
}