# Veronica: Custom Causal LM (decoder-only)

Veronica is a custom decoder-only large language model designed to maximize depth efficiency and token-level reasoning quality under limited resources.
It features 32 layers × 1024 hidden size × 16 attention heads (GQA=4), extended context via RoPE (θ=1e6) with YaRN scaling up to 32k tokens, and attention routing with DuoAttention and SEAL scaling.
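
For reference, the architecture above maps onto configuration fields roughly as sketched below. This is illustrative only: the attribute names belong to Veronica's custom config and may differ from the guesses shown here, and the values simply restate the description above. Note that 16 query heads with GQA=4 means 4 key/value heads.

```python
from transformers import AutoConfig

# Inspect the custom config shipped with the repo (requires trust_remote_code=True).
cfg = AutoConfig.from_pretrained("MhaWay/veronica", trust_remote_code=True)

# Expected values based on the description above; the actual attribute
# names in the custom config may differ:
# cfg.num_hidden_layers        -> 32     (decoder layers)
# cfg.hidden_size              -> 1024   (model width)
# cfg.num_attention_heads      -> 16     (query heads)
# cfg.num_key_value_heads      -> 4      (GQA: 16 query heads share 4 KV heads)
# cfg.rope_theta               -> 1e6    (RoPE base)
# cfg.max_position_embeddings  -> 32768  (context length with YaRN scaling)
print(cfg)
```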

**Status:** prototype, pretraining in progress.
This repository currently provides the code, config, and tokenizer needed to load Veronica with `trust_remote_code=True`.
Model weights will be released in a future update.


## Quickstart

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

name = "MhaWay/veronica"

# The custom architecture requires trust_remote_code=True.
tok = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    name,
    trust_remote_code=True,
    torch_dtype="auto",   # pick the checkpoint's native dtype
    device_map="auto",    # place the model on available devices
)

prompt = "Explain in simple terms what Veronica is:"
out = model.generate(**tok(prompt, return_tensors="pt").to(model.device))
print(tok.decode(out[0], skip_special_tokens=True))
```
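
The call above uses `generate`'s defaults, which yield only a short greedy continuation. For longer or more varied output, the standard Hugging Face generation arguments apply; the values below are illustrative, not tuned for Veronica.

```python
out = model.generate(
    **tok(prompt, return_tensors="pt").to(model.device),
    max_new_tokens=256,  # allow a longer completion than the default
    do_sample=True,      # sample instead of greedy decoding
    temperature=0.7,     # illustrative value, not tuned for this model
    top_p=0.9,           # nucleus sampling cutoff
)
print(tok.decode(out[0], skip_special_tokens=True))
```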