Model Details
Model Description
- Developed by: Tuan Pham (FPTU HCM student)
- Model type: Llama 2 7B, decoder-only
- Finetuned from models:
  - meta-llama/Llama-2-7b
  - bkai-foundation-models/vietnamese-llama2-7b-120GB
  - yeen214/llama2_7b_merge_orcafamily
- Bilingual support: English and Vietnamese
Model Sources
- Repository:
- Paper: ...
- Demo: ...
Uses
Prompt template
[SYSTEM_PROMPT]
####### Instruction:
[INPUT]
%%%%%%% Response:
[RESPONSE]
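A minimal sketch of assembling a prompt in this format (the helper name and the sample inputs are illustrative, not part of the card):

def build_prompt(system_prompt: str, instruction: str) -> str:
    # Fill the template above, leaving the response slot for the model to complete.
    return (
        f"{system_prompt}\n"
        "####### Instruction:\n"
        f"{instruction}\n"
        "%%%%%%% Response:\n"
    )

prompt = build_prompt(
    "You are a helpful bilingual assistant.",  # illustrative system prompt
    "Việt Nam có bao nhiêu tỉnh thành?",       # "How many provinces does Vietnam have?"
)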
How to Get Started with the Model
Use the code below to get started with the model.
import torch
from torch.cuda.amp import autocast
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline

model_name = "1TuanPham/T-Llama-v1.1"

# Load the weights in bfloat16 with the KV cache enabled for faster generation;
# device_map="auto" places the model on the GPU when one is available.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    use_cache=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
streamer = TextStreamer(tokenizer, skip_special_tokens=True)  # stream decoded tokens to stdout
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, streamer=streamer)

with autocast():
    output_default = pipe(
        "Phạm Nhật Vượng là ",  # "Pham Nhat Vuong is ..."
        pad_token_id=tokenizer.eos_token_id,  # 50256 is GPT-2's EOS and lies outside the Llama vocab
        max_new_tokens=128,
    )
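If the bfloat16 weights do not fit in memory, the model can instead be loaded with 4-bit quantization through bitsandbytes. A sketch, assuming the bitsandbytes and accelerate packages are installed (this loading path is not part of the original card):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "1TuanPham/T-Llama-v1.1"

# NF4 4-bit quantization with bfloat16 compute, cutting memory use to roughly a quarter.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)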
Training Details
Hardware Type:
- GPU: NVIDIA Tesla P100 16GB
- System RAM: 29GB
Hours used: ~42.5 (approximate)
Training Data
- BactrianX
- OpenOrca_translated
- WizardLM_70k_translated
- TigerLabMathInstruct_translated_vi
- GradeSchoolMathInstruct_translated
- vilm_lima-vi
- MTEngVietnamese
- databricks_dolly15k_translated
- AlpacaCleaned_translated
- databricks_dolly15k
- OpenOrca
- GradeSchoolMathInstruct
- AlpacaCleaned
- WebglmQA
Training Procedure
Learning rate: 2e-5 with a cosine schedule
Optimizer: PagedLion8bit
QLoRA: rank 64, 4-bit quantization
- 250k examples (70% Vietnamese, 30% English) for 3.37 epochs
- 350k examples (60% Vietnamese, 40% English) for 1.1 epochs
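A sketch of how these hyperparameters map onto a peft/transformers QLoRA setup; lora_alpha, the target modules, and the unlisted TrainingArguments values are assumptions, not taken from this card:

import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit quantization of the base model, as used during QLoRA training.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter at rank 64; alpha and target modules are assumed, not stated above.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Learning rate, schedule, and optimizer as listed above.
training_args = TrainingArguments(
    output_dir="t-llama-qlora",  # hypothetical output path
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    optim="paged_lion_8bit",
    bf16=True,
)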
Training loss
Evaluation
Results
[More Information Needed]
Technical Specifications
Model Architecture and Objective
[More Information Needed]
Citation
Model Card Authors
Model Card Contact
[More Information Needed]