Liyama-3B / README.md
marcuscedricridia's picture
Update README.md
87a72d6 verified
|
raw
history blame
2.77 kB
metadata
base_model: unsloth/llama-3.2-3b-instruct-unsloth-bnb-4bit
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - llama
  - trl
license: apache-2.0
language:
  - en
  - tl
datasets:
  - Linggowiktiks/AnoNa

πŸ¦™ Liyama-3B

Liyama-3B is a fine-tuned version of Meta’s LLaMA-3B (3.2) model, built to understand and respond fluently in Tagalog. It was trained on the AnoNa dataset over 3 epochs, aiming for natural, context-aware instruction-following in Filipino.


πŸ”€ Origin of the Name

The name Liyama is a Tagalified version of llama, reflecting both its LLaMA base and its Tagalog-focused language capabilities. It mirrors how Filipino often adapts foreign terms into familiar, phonetic formsβ€”like camera β†’ kamera, lion β†’ leon, and now, llama β†’ liyama.


🧠 Training Data: The AnoNa Dataset

Liyama-3B was trained solely on response completions from the AnoNa dataset β€” a self-instruct corpus generated using Gemini 1.5 and 2.0.

Inspired by SimpleQnA, the dataset contains short, helpful instruction-response pairs. But AnoNa introduces several improvements:

  • βœ… Less English, More Tagalog prompts
  • βœ… Less IFEVAL-style formatting
  • βœ… No overuse of modifiers in instructions
  • βœ… Balanced task types to avoid dominant categories
  • βœ… Complex tasks favored (65% complex / 35% simple)
  • βœ… Reduced sycophancy and generic praise
  • βœ… Improved follow-up handling
  • βœ… AI self-intro appears only when relevant
  • βœ… Implicit chain-of-thought reasoning, not labeled
  • βœ… Extra task types added to increase variety

This focus creates a model that's practical, straightforward, and tuned for realistic conversational use in Filipino, without excessive formatting or irrelevant disclaimers.


πŸ—£οΈ Use Case

Liyama-3B is ideal for:

  • Answering questions in Tagalog
  • Writing essays, reflections, and letters in Filipino
  • Following natural instructions, even when mixed with English
  • Chat-based tasks where fluency and tone matter
  • Educational or community apps centered around local language use

πŸ“¦ Model Details

Feature Value
Base Model LLaMA-3B v3.2
Fine-tuned Dataset AnoNa
Epochs 3
Language Focus Tagalog (with some English)
Prompt Format Responses only

Liyama-3B is part of a broader effort to create open, practical Filipino-language models for real useβ€”not just benchmarks. Expect follow-ups tuned for multi-turn chat, reasoning, and creative tasks.