---
base_model:
- meta-llama/Llama-3.3-70B-Instruct
license: llama3.3
language:
- zh
- en
library_name: transformers
---
# Overview
This model is a fine-tuned version of LLaMA 3.3 70B, optimized for multilingual benchmarks including TMMLU+, TMLU, and MMLU. The fine-tuning process focused on enhancing reasoning, comprehension, and domain-specific performance, and was carried out as part of an iterative pipeline leveraging large-scale datasets and Chain-of-Thought (CoT) methodologies.
---
# Key Features
β€’ Base Model: LLaMA 3.3 70B
β€’ Dataset Sources: Custom-generated using LLMs, focused on high-quality, multilingual tasks.
β€’ Chain-of-Thought Fine-Tuning: Enhanced logical reasoning with curated datasets.
# Data Preparation
1. Custom Dataset Generation
2. Traditional Chinese Data Filtering
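The filtering step itself is not documented here, so the sketch below is only one plausible heuristic, not the actual pipeline: treat a record as Traditional Chinese when an OpenCC Traditional-to-Simplified conversion changes it, i.e. when it contains Traditional-specific characters.

```python
# Illustrative heuristic only; the real filtering pipeline is not published.
# Text that changes under a Traditional -> Simplified conversion contains
# Traditional-specific characters, so we treat it as Traditional Chinese.
from opencc import OpenCC  # pip install opencc-python-reimplemented

t2s = OpenCC("t2s")

def is_traditional_chinese(text: str) -> bool:
    """True if the text contains at least one Traditional-specific character."""
    return t2s.convert(text) != text

samples = ["這是一段繁體中文。", "这是一段简体中文。", "Plain English text."]
print([s for s in samples if is_traditional_chinese(s)])
# -> ['這是一段繁體中文。']
```

Note that text written entirely in characters shared by both scripts would slip past this check, so a production filter would likely combine it with language identification.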
# Evaluation
Please check out the [Open TW LLM Leaderboard](https://huggingface.co/spaces/yentinglin/open-tw-llm-leaderboard) for the full and updated list. A sketch of the multiple-choice prompt format these benchmarks use follows the table.
| Model | TMMLU+ | TMLU | Function Calling |
| :---- | :----- | :--- | :--------------- |
| [ubitus/Lilith-70B-Instruct](https://huggingface.co/ubitus/Lilith-70B-Instruct) | **76.06%** | 73.70% | ✅ |
| [Llama-3-Taiwan-70B-Instruct](https://huggingface.co/yentinglin/Llama-3-Taiwan-70B-Instruct) | 67.53% | **74.76%** | ✅ |
| [Qwen1.5-110B-Chat](https://huggingface.co/Qwen/Qwen1.5-110B-Chat) | 65.81% | 75.69% | ✅ |
| [Yi-34B-Chat](https://huggingface.co/01-ai/Yi-34B-Chat) | 64.10% | 73.59% | ✅ |
| [Meta-Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct) | 62.75% | 70.95% | ✅ |
| [Llama-3-Taiwan-8B-Instruct](https://huggingface.co/yentinglin/Llama-3-Taiwan-8B-Instruct) | 52.28% | 59.50% | ✅ |
| [Mixtral-8x22B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1) | 52.16% | 55.57% | ✅ |
| [Gemini-1.5-Pro](https://ai.google.dev/gemini-api/docs) | 49.92% | 61.40% (5-shot) | ✅ |
| [Breexe-8x7B-Instruct-v0_1](https://huggingface.co/MediaTek-Research/Breexe-8x7B-Instruct-v0_1) | 48.92% | - | ❓ |
| [Breeze-7B-Instruct-v1_0](https://huggingface.co/MediaTek-Research/Breeze-7B-Instruct-v1_0) | 41.77% | 55.57% | ❓ |
| [Llama3-TAIDE-LX-8B-Chat-Alpha1](https://huggingface.co/taide/Llama3-TAIDE-LX-8B-Chat-Alpha1) | 39.03% | 47.30% | ❓ |
| [Claude-3-Opus](https://www.anthropic.com/api) | - | 73.59% (5-shot) | ✅ |
| [GPT4-o](https://platform.openai.com/docs/api-reference/chat/create) | - | 65.56% (0-shot), 69.88% (5-shot) | ✅ |
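TMMLU+ and TMLU are multiple-choice benchmarks, so scoring reduces to prompting the model with a question plus lettered options and checking which letter it answers. The sketch below only illustrates that prompt shape; the sample question, the `generate_fn` hook, and the letter-extraction rule are assumptions for illustration, not the leaderboard's actual harness.

```python
# Illustrative TMMLU+/TMLU-style multiple-choice item (not a real benchmark item).
question = "台灣最高的山是哪一座？"
options = {"A": "玉山", "B": "雪山", "C": "合歡山", "D": "阿里山"}
gold = "A"

prompt = question + "\n" + "\n".join(f"{k}. {v}" for k, v in options.items()) + "\n答案："

def score(generate_fn, prompt: str, gold: str) -> bool:
    """generate_fn stands in for any completion call, e.g. the snippet above."""
    reply = generate_fn(prompt)
    predicted = next((ch for ch in reply if ch in options), None)  # first A-D letter
    return predicted == gold

# Benchmark accuracy is then the mean of score(...) over all items.
fake_generate = lambda p: "A. 玉山"
print(score(fake_generate, prompt, gold))  # -> True
```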
# Intended Use
This model is well-suited for:
1. Multilingual Comprehension Tasks: designed to handle diverse languages and formats.
2. Domain-Specific Applications: excels at logical reasoning and structured problem-solving.
3. Benchmarks and Testing: a strong choice for academic and industrial evaluations in multilingual NLP.