Titans-v2-Llama-3.2-1B
Titanesque version of meta-llama/Llama-3.2-1B with parallel linearized attention (TPTT) and PEFT.
The architecture was presented in the TPTT paper.
Model list
Classic model parameters with LiZA injection (a sketch of the delta_rule recurrence follows the table):
| Subfolder | Max Self-Attn Length | Mag Weight | Cross Gate | Max Chunk Size | Bidirectional | LoRA | Description |
|---|---|---|---|---|---|---|---|
| delta_rule | 8192 (default) | 0.5 | False | 64 | False | Yes | Parallel linearized attention with the delta_rule operator |
| delta_rule_gelu | 8192 (default) | 0.5 | False | 64 | False | Yes | Non-linear operator with GELU activation |
| delta_product | 8192 (default) | 0.5 | False | 64 | False | Yes | Second-order operator with the product trick |
| delta_product_r | 8192 (default) | 0.5 | False | 64 | False | Yes | Second-order operator with the rotative trick |
| delta_product_c | 8192 (default) | 0.5 | False | 64 | False | Yes | Second-order operator with the combined trick |
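For intuition, the delta_rule operator corresponds to the classic delta-rule update of a fast-weight (linear attention) state. The snippet below is a minimal, illustrative sketch of that recurrence for a single head with hypothetical shapes and a fixed gate; it is not the TPTT/LiZA kernel itself, which applies the update in parallel chunks (see Max Chunk Size above).

```python
import torch

def delta_rule_step(S, k, v, q, beta):
    """One recurrent step of a delta-rule linear-attention update (illustrative).

    S    : (d_v, d_k) fast-weight state matrix
    k, q : (d_k,) key / query vectors
    v    : (d_v,) value vector
    beta : scalar gate in [0, 1]
    """
    # Delta rule: replace the value currently associated with k rather than
    # purely accumulating (S @ k is the "old" value stored for k, v is the new one).
    S = S + beta * torch.outer(v - S @ k, k)
    o = S @ q  # read-out for this timestep
    return S, o

# Toy usage on a random sequence (shapes chosen only for illustration).
d_k, d_v, T = 8, 8, 16
S = torch.zeros(d_v, d_k)
for t in range(T):
    k, q, v = torch.randn(d_k), torch.randn(d_k), torch.randn(d_v)
    S, o = delta_rule_step(S, k, v, q, beta=0.5)
```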
Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "ffurfaro/Titans-v2-Llama-3.2-1B",
    subfolder="tptt_subfolder",  # see the repo tree (e.g. delta_rule, delta_product)
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("ffurfaro/Titans-v2-Llama-3.2-1B")

prompt = "Your prompt here"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
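Since the checkpoints ship with PEFT/LoRA adapters, further fine-tuning can follow the standard peft workflow. The snippet below is a hedged sketch: the rank, alpha, and target module names (q_proj, v_proj) are assumptions based on the Llama backbone, not values taken from this repository.

```python
# Minimal PEFT/LoRA fine-tuning sketch (hypothetical hyperparameters).
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # assumed Llama projection layers
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()
# ...train with your usual Trainer or custom training loop...
```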
Citation & Contact
If you use TPTT in your academic work, please cite Furfaro. For questions or support, please open an issue on the GitHub repository or contact the maintainer.