Llama-3.2-3B-Tele-it Model Card

Model Summary

Llama-3.2-3B-Tele-it is the instruction-tuned version of Llama-3.2-3B-Tele, a model based on Meta Llama-3.2-3B and specialized in telecommunications. It was fine-tuned to follow instructions using supervised fine-tuning (SFT) on a combination of the Alpaca and Open-instruct datasets.

Context Length

The context length of the model is 8192 tokens.
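As a rough illustration of what this limit means in practice, the sketch below clamps a requested generation length so that prompt plus output stays within 8192 tokens. It assumes that prompt tokens and generated tokens share the same window. The helper name clamp_max_new_tokens is hypothetical and not part of the model or of any library.

```python
CONTEXT_LENGTH = 8192  # context length stated in this model card


def clamp_max_new_tokens(prompt_tokens: int, requested: int,
                         context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many new tokens fit in the context after the prompt.

    Hypothetical helper: assumes prompt and generated tokens share one window.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    # Tokens left once the prompt is accounted for (never negative)
    remaining = max(context_length - prompt_tokens, 0)
    return min(requested, remaining)


print(clamp_max_new_tokens(prompt_tokens=8, requested=100))     # 100
print(clamp_max_new_tokens(prompt_tokens=8150, requested=100))  # 42
```

The returned value could be passed as max_new_tokens when generating, with the prompt length taken from the tokenizer output.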

Usage

Llama-3.2-3B-Tele-it has been fine-tuned using pairs of instructions and responses from the Alpaca and Open-instruct datasets, separated by the "\n" delimiter. Below is an example of how to query the model using this format:

Prompt: Explain to me Shannon capacity.\n

Model: Shannon capacity, also known as the channel capacity, is a fundamental concept in information theory and communication engineering. It was first introduced by Claude Shannon in his 1948 paper "A Mathematical Theory of Communication". The concept of channel capacity is used to determine the maximum amount of information that can be transmitted reliably through a communication channel, given the constraints of the channel's bandwidth, noise, and other factors.
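The prompt format above can be sketched as a small helper that makes sure an instruction ends with exactly one "\n" delimiter before it is sent to the model. The function name build_prompt is a hypothetical choice for illustration. Stripping surrounding whitespace is also an assumption, not something the model card specifies.

```python
DELIMITER = "\n"  # separates instruction and response, per this model card


def build_prompt(instruction: str) -> str:
    """Format an instruction so that it ends with exactly one delimiter.

    Hypothetical helper: stripping surrounding whitespace is an assumption.
    """
    return instruction.strip() + DELIMITER


# repr() makes the trailing delimiter visible
print(repr(build_prompt("Explain to me Shannon capacity.")))
```

The resulting string can be used as the prompt in the snippets under Sample Code.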

Sample Code

The snippets below show how to get started quickly with the model. First, run pip install transformers, then copy the snippet that matches your hardware and adapt it to your use case.

Running the model on a CPU

from transformers import AutoTokenizer, AutoModelForCausalLM

# torch_dtype="auto" loads the weights in the dtype stored in the checkpoint
model = AutoModelForCausalLM.from_pretrained("AliMaatouk/Llama-3.2-3B-Tele-it", torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained("AliMaatouk/Llama-3.2-3B-Tele-it")

# The trailing "\n" is the delimiter between instruction and response
prompt = "Explain to me Shannon capacity.\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)

# Skip the prompt tokens so that only the generated response is decoded
generated_tokens = outputs[0, inputs["input_ids"].shape[1]:]
response = tokenizer.decode(generated_tokens, skip_special_tokens=True)
print(response)

Running the model on a single / multi GPU

from transformers import AutoModelForCausalLM, AutoTokenizer

# device_map="auto" places the model on the available GPU(s); it requires the accelerate package
model = AutoModelForCausalLM.from_pretrained("AliMaatouk/Llama-3.2-3B-Tele-it", torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("AliMaatouk/Llama-3.2-3B-Tele-it")

# The trailing "\n" is the delimiter between instruction and response
prompt = "Explain to me Shannon capacity.\n"
# Move the inputs to the device that holds the model's first layers
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)

# Skip the prompt tokens so that only the generated response is decoded
generated_tokens = outputs[0, inputs["input_ids"].shape[1]:]
response = tokenizer.decode(generated_tokens, skip_special_tokens=True)
print(response)

Citation

The paper describing the model in detail is available at https://arxiv.org/abs/2409.05314. Please cite it as follows:

@misc{maatouk2024telellmsseriesspecializedlarge,
      title={Tele-LLMs: A Series of Specialized Large Language Models for Telecommunications}, 
      author={Ali Maatouk and Kenny Chirino Ampudia and Rex Ying and Leandros Tassiulas},
      year={2024},
      eprint={2409.05314},
      archivePrefix={arXiv},
      primaryClass={cs.IT},
      url={https://arxiv.org/abs/2409.05314}, 
}
