Commit 64c6c91 by MhaWay · verified · 1 Parent(s): 46c2a06

Update README.md

Files changed (1)
  1. README.md +10 -7
README.md CHANGED
@@ -1,7 +1,7 @@
  ---
  language:
- - it
  - en
+ - it
  library_name: transformers
  license: apache-2.0
  tags:
@@ -20,11 +20,14 @@ model-index:

  # Veronica — Custom Causal LM (decoder-only)

- **Veronica** è un modello *decoder-only* custom, progettato per massimizzare la **profondità effettiva** e la qualità per token con risorse contenute.
- Architettura: **32 layer × 1024 hidden × 16 heads, GQA=4**, **RoPE (θ=1e6) + YaRN scaling** per contesto lungo **32k**.
- Attenzione: **DuoAttention** (stream vs full window) + **SEAL** scaling sulle retrieval-heads. **RMSNorm** + **SwiGLU**.
+ **Veronica** is a custom *decoder-only* large language model, designed to maximize **depth efficiency** and token-level reasoning quality under limited resources.
+ It features **32 layers × 1024 hidden × 16 heads (GQA=4)**, extended context via **RoPE (θ=1e6) + YaRN scaling** up to **32k tokens**, and advanced attention routing with **DuoAttention** and **SEAL scaling**.

- > **Stato**: prototipo in pretraining. Questa repo pubblica **codice + config + tokenizer** per il caricamento via `trust_remote_code=True`. I pesi saranno pubblicati successivamente.
+ > **Status:** prototype under pretraining.
+ > This repository currently provides **code, config, and tokenizer** to load Veronica with `trust_remote_code=True`.
+ > Model weights will be released in a future update.
+
+ ---

  ## Quickstart

@@ -40,6 +43,6 @@ model = AutoModelForCausalLM.from_pretrained(
      device_map="auto",
  )

- prompt = "Spiega in modo semplice cos'è Veronica:"
+ prompt = "Explain in simple terms what Veronica is:"
  out = model.generate(**tok(prompt, return_tensors="pt").to(model.device))
- print(tok.decode(out[0], skip_special_tokens=True))
+ print(tok.decode(out[0], skip_special_tokens=True))
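
Note on the architecture line in the diff: the figures quoted there (32 layers, hidden size 1024, 16 heads, GQA=4, RoPE θ=1e6) imply a few derived quantities that a short sketch makes concrete. The snippet below is only illustrative arithmetic, not code from the repository. The variable names are hypothetical rather than Veronica's actual config keys, and "GQA=4" is read here as 4 key/value heads (with 16 query heads this also means 4 query heads per KV head, so both readings give the same numbers). YaRN scaling, DuoAttention, and SEAL are not modeled.

```python
# Illustrative arithmetic from the figures quoted in the README diff.
# All names are hypothetical; they are not taken from Veronica's config.
num_layers = 32
hidden_size = 1024
num_heads = 16
num_kv_heads = 4      # assumption: "GQA=4" read as 4 key/value heads
rope_theta = 1e6

head_dim = hidden_size // num_heads          # width of one attention head
queries_per_kv = num_heads // num_kv_heads   # query heads sharing one KV head


def rope_inv_freq(dim, theta):
    """Standard RoPE inverse frequencies: theta ** (-2i / dim), i in [0, dim/2)."""
    return [theta ** (-(2 * i) / dim) for i in range(dim // 2)]


inv_freq = rope_inv_freq(head_dim, rope_theta)

# Cached values per token: K and V, for every layer and every KV head.
kv_values_per_token = 2 * num_layers * num_kv_heads * head_dim
# Same count if every query head kept its own K/V (plain multi-head attention).
mha_values_per_token = 2 * num_layers * num_heads * head_dim

print("head_dim:", head_dim)
print("query heads per KV head:", queries_per_kv)
print("RoPE frequency pairs per head:", len(inv_freq))
print("KV cache reduction vs. MHA:", mha_values_per_token // kv_values_per_token)
```

Under these assumptions each head is 64 wide, and grouped-query attention stores a quarter of the key/value entries that full multi-head attention would, which matters most at the 32k-token context the README targets.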