Veronica: Custom Causal LM (decoder-only)
Veronica is a custom decoder-only large language model, designed to maximize depth efficiency and token-level reasoning quality under limited resources.
It features 32 layers × 1024 hidden × 16 heads (GQA with 4 KV heads), extended context via RoPE (θ = 1e6) plus YaRN scaling up to 32k tokens, and advanced attention routing with DuoAttention and SEAL scaling.
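The hyperparameters above map onto a config roughly like the following. This is a hedged sketch: the field names follow common Hugging Face conventions, and the YaRN scaling factor is an assumption, not Veronica's actual config.json.

```python
# Sketch of the architecture described above. Key names follow typical
# Hugging Face config conventions and may differ from Veronica's real
# config.json; the YaRN factor is an assumed value for illustration.
veronica_arch = {
    "num_hidden_layers": 32,
    "hidden_size": 1024,
    "num_attention_heads": 16,
    "num_key_value_heads": 4,        # GQA: 4 KV heads shared by 16 query heads
    "rope_theta": 1.0e6,             # RoPE base frequency
    "rope_scaling": {"type": "yarn", "factor": 8.0},  # assumed factor
    "max_position_embeddings": 32768,                 # 32k-token context
}
```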
Status: prototype; pretraining in progress.
This repository currently provides the code, config, and tokenizer needed to load Veronica with `trust_remote_code=True`.
Model weights will be released in a future update.
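Since the checkpoint is not yet published, a randomly initialized instance can still be built from the config alone, e.g. for shape or memory smoke tests. A minimal sketch, assuming the custom config class is exposed through `AutoConfig`:

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Load only the architecture definition; no weights are downloaded.
config = AutoConfig.from_pretrained("MhaWay/veronica", trust_remote_code=True)

# Build a randomly initialized model from that config, useful for checking
# parameter count and memory footprint before the checkpoint is released.
model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters")
```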
Quickstart
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

name = "MhaWay/veronica"

# Veronica's architecture is custom, so remote code must be trusted.
tok = AutoTokenizer.from_pretrained(name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    name,
    trust_remote_code=True,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # place layers on available devices automatically
)

prompt = "Explain in simple terms what Veronica is:"
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs)
print(tok.decode(out[0], skip_special_tokens=True))
```
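For longer or less deterministic output, decoding parameters can be passed to `generate`. These are standard Hugging Face generation arguments, not Veronica-specific settings; the values shown are illustrative:

```python
# Sampling-based generation; by default generate() produces very short,
# greedy output, so max_new_tokens and sampling flags are set explicitly.
out = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tok.decode(out[0], skip_special_tokens=True))
```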