Model Details

Model Description

  • Developed by: Tuan Pham (FPTU HCM Student)
  • Model type: Llama2-7B Decoder-only
  • Finetuned from models:
    • meta-llama/Llama-2-7b
    • bkai-foundation-models/vietnamese-llama2-7b-120GB
    • yeen214/llama2_7b_merge_orcafamily
  • Bilingual support: English and Vietnamese

Model Sources

Uses

Prompt template

[SYSTEM_PROMPT]

 ####### Instruction:
[INPUT]

 %%%%%%% Response:
[RESPONSE]
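
The template above can be filled in with a small helper. This is only a sketch: the function name `build_prompt` and the example strings are hypothetical, and the exact whitespace (the blank lines and the leading space before each marker) is copied from the template as rendered on this card, so it may differ from what was used in training.

```python
def build_prompt(system_prompt: str, user_input: str) -> str:
    """Fill the card's prompt template, leaving the response slot empty.

    Assumption: whitespace mirrors the template as shown in the card.
    """
    return (
        f"{system_prompt}\n"
        "\n"
        " ####### Instruction:\n"
        f"{user_input}\n"
        "\n"
        " %%%%%%% Response:\n"
    )


# Hypothetical example values, not taken from the card.
prompt = build_prompt(
    "You are a helpful bilingual assistant.",
    "Translate 'good morning' into Vietnamese.",
)
print(prompt)
```

The returned string ends right after the response marker, so the model's generated text takes the place of [RESPONSE].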

How to Get Started with the Model

Use the code below to get started with the model.

import torch
from torch.cuda.amp import autocast
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline

model_name = "1TuanPham/T-Llama-v1.1"
# Load the weights in bfloat16 and keep the KV cache enabled for faster generation.
model = AutoModelForCausalLM.from_pretrained(model_name,
                                             torch_dtype=torch.bfloat16,
                                             use_cache=True,
                                             )
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
# Stream generated tokens to stdout as they are produced.
streamer = TextStreamer(tokenizer, skip_special_tokens=True)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, streamer=streamer)

with autocast():
  output_default = pipe("Phạm Nhật Vượng là ", pad_token_id=50256, max_new_tokens=128)

Training Details

Hardware Type:

  • GPU: NVIDIA Tesla P100 16GB
  • System RAM: 29GB

Hours used: approximately 42.5

Training Data

  • BactrianX
  • OpenOrca_translated
  • WizardLM_70k_translated
  • TigerLabMathInstruct_translated_vi
  • GradeSchoolMathInstruct_translated
  • vilm_lima-vi
  • MTEngVietnamese
  • databricks_dolly15k_translated
  • AlpacaCleaned_translated
  • databricks_dolly15k
  • OpenOrca
  • GradeSchoolMathInstruct
  • AlpacaCleaned
  • WebglmQA

Training Procedure

  • Learning rate: 2e-5 with a cosine schedule

  • Optimizer: PagedLion8bit

  • QLoRA: rank 64, 4-bit quantization

    • 250k examples (70% Vietnamese, 30% English) for 3.37 epochs
    • 350k examples (60% Vietnamese, 40% English) for 1.1 epochs
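
To make the two data mixes concrete, the short calculation below works out the per-language example counts and the approximate number of training samples seen. It is a sketch under stated assumptions: "k" is read as exactly 1,000, the two bullets are treated as two separate runs, and the names `MIXES` and `mix_summary` are illustrative rather than part of the training code.

```python
# (examples, Vietnamese share in percent, epochs) for each mix listed in the card.
# Assumption: "250k" and "350k" mean exactly 250,000 and 350,000 examples.
MIXES = [
    (250_000, 70, 3.37),
    (350_000, 60, 1.1),
]


def mix_summary(examples: int, vi_percent: int, epochs: float) -> dict:
    """Split one mix by language and estimate the samples seen in training."""
    vietnamese = examples * vi_percent // 100
    english = examples - vietnamese
    samples_seen = round(examples * epochs)
    return {
        "vietnamese": vietnamese,
        "english": english,
        "samples_seen": samples_seen,
    }


summaries = [mix_summary(*mix) for mix in MIXES]
total_seen = sum(s["samples_seen"] for s in summaries)

for s in summaries:
    print(s)
print("total samples seen:", total_seen)
```

Under these assumptions the first mix holds 175,000 Vietnamese and 75,000 English examples, the second holds 210,000 and 140,000, and the two runs together cover roughly 1,227,500 training samples.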

Training loss

[Image: training loss curve]

Evaluation

Results

[More Information Needed]

Technical Specifications

Model Architecture and Objective

[More Information Needed]

Citation

Model Card Authors

Model Card Contact

[More Information Needed]
