
Oganesson-TinyLlama-1.2B

Oganesson-TinyLlama-1.2B is a lightweight, efficient language model built on the LLaMA 3.2 1.2B architecture. Fine-tuned for general-purpose instruction following, mathematical reasoning, and code generation, it is well suited to edge devices, personal assistants, and educational applications that require a compact yet capable model.

GGUF: https://huggingface.co/prithivMLmods/Oganesson-TinyLlama-1.2B-GGUF


Key Features

  1. LLaMA 3.2 1.2B Core: Powered by the TinyLlama (1.2B) variant of Meta's LLaMA 3.2, offering modern instruction-following and multilingual capabilities in a very small footprint.

  2. Modular Fine-Tuning: Trained on a handcrafted modular dataset covering general-purpose reasoning, programming problems, and mathematical challenges.

  3. Mathematical Competence: Solves equations, explains concepts, and performs symbolic logic in algebra, geometry, and calculus, making it well suited to lightweight tutoring use cases.

  4. Code Understanding & Generation: Produces clean, interpretable code in Python, JavaScript, and more. Useful for micro-agents, code assistants, and embedded development tools.

  5. Versatile Output Formats: Handles JSON, Markdown, LaTeX, and structured data output, enabling integration into tools and platforms that need formatted results (see the sketch after this list).

  6. Edge-Optimized: At only 1.2B parameters, the model is built for local inference, on-device usage, and battery-efficient environments.
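
For example, structured output can be requested directly in the prompt. The following is a minimal sketch; the JSON schema and system instruction are illustrative assumptions, and the model is prompted, not constrained, to follow them:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Oganesson-TinyLlama-1.2B"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Illustrative schema; the model does not enforce it.
messages = [
    {"role": "system", "content": "Respond only with valid JSON."},
    {"role": "user", "content": 'Solve 2x + 6 = 14. Reply as {"x": <number>, "steps": [<string>, ...]}.'},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))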


Quickstart with Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Oganesson-TinyLlama-1.2B"

# Load the model with automatic dtype selection and device placement
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Write a Python function to compute the Fibonacci sequence."

messages = [
    {"role": "system", "content": "You are a helpful coding and math assistant."},
    {"role": "user", "content": prompt}
]

# Render the chat messages with the model's chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate a response, then strip the prompt tokens from the output
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
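
For on-device or CPU-only use, the GGUF build linked above can be run with llama-cpp-python. A minimal sketch follows; the quantization filename pattern is an assumption, so check the GGUF repository for the exact files:

from llama_cpp import Llama

# Download and load a quantized build from the GGUF repo.
# The filename glob is an assumption; adjust it to a file that exists in the repo.
llm = Llama.from_pretrained(
    repo_id="prithivMLmods/Oganesson-TinyLlama-1.2B-GGUF",
    filename="*Q4_K_M.gguf",
    n_ctx=2048,
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful coding and math assistant."},
        {"role": "user", "content": "Explain the quadratic formula in one short paragraph."},
    ],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])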

Intended Use

  • Lightweight reasoning for embedded and edge AI
  • Basic math tutoring and symbolic computation
  • Code generation and explanation for small apps
  • Technical content in Markdown, JSON, and LaTeX
  • Educational tools, personal agents, and low-power deployments

Limitations

  • Smaller context window than typical 7B+ models
  • Less suitable for abstract reasoning or long-form creative writing
  • May require prompt engineering for complex technical queries (see the sketch after this list)
  • Knowledge is limited to its pretraining and fine-tuning data
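
To illustrate the prompt-engineering point, a few-shot framing (entirely illustrative) often helps small models stay on track for multi-step questions:

# Worked example pairs steer the model toward step-by-step answers.
messages = [
    {"role": "system", "content": "You are a careful math assistant. Work step by step."},
    {"role": "user", "content": "Simplify (x^2 - 9) / (x - 3)."},
    {"role": "assistant", "content": "Factor the numerator: (x - 3)(x + 3). Cancel (x - 3) to get x + 3, for x != 3."},
    {"role": "user", "content": "Simplify (x^2 - 25) / (x + 5)."},
]
# Pass `messages` through apply_chat_template and generate as in the Quickstart above.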
