TSLAM-15B: Telecom-Specific Large Action Model

TSLAM-15B is a cutting-edge, 15-billion-parameter language model developed by NetoAI Solutions Pvt. Ltd. and tailored explicitly for the telecommunications industry. It is a fine-tuned variant of the Mixture-of-Experts (MoE) Qwen3-30B-A3B-Instruct-2507 model, optimized for telecom domain expertise, advanced reasoning, and action-oriented workflows.


License

This model is fully owned by NetoAI. Contact us at [email protected] for access and a commercial usage license.


Model Architecture and Benefits

TSLAM-15B builds on the Qwen3-30B-A3B-Instruct-2507 Mixture-of-Experts (MoE) model, which features:

  • Efficiency and Speed: As a 4-bit quantized model, TSLAM-15B delivers enterprise-level performance at a footprint small enough to run on a single NVIDIA A100 GPU (see the back-of-the-envelope estimate after this list).
  • Long Context Window: Supports sequences of up to 256,000 tokens, enabling comprehensive multi-turn dialogues and large-document analysis.
  • Robust Telecom Language Understanding: Fine-tuned on proprietary telecom datasets, including protocols (3GPP, IETF), technical manuals, operational logs, and customer interactions.
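
As a rough sanity check on the single-A100 claim, here is a back-of-the-envelope estimate of the quantized weight footprint (illustrative only; it ignores activations, the KV cache, and quantization overhead, all of which add real memory):

params = 15e9             # 15B parameters
bytes_per_param = 0.5     # 4-bit quantization = 0.5 bytes per parameter
weights_gib = params * bytes_per_param / 1024**3
print(f"~{weights_gib:.1f} GiB for weights")  # ~7.0 GiB, comfortably within a 40/80 GB A100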

Key Features

  • Telecom-Domain Expertise: Specialized knowledge from telecom datasets for accurate domain-specific responses.
  • Action-Oriented Outputs: Can suggest configurations, troubleshoot faults, automate network operations, and generate technical documentation (an illustrative prompt follows this list).
  • Large Context Window (256K tokens): Enables analysis of long conversations, extended reports, and multi-document reasoning.
  • Enterprise-Grade Deployment: Designed to operate efficiently in demanding environments with real-time constraints.
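
As an illustration of the action-oriented style, a prompt along these lines (hypothetical; the full loading and generation pipeline is shown in the example code below) asks the model to emit a ready-to-apply configuration rather than a general explanation:

# Hypothetical action-oriented prompt; pair with the inference code shown later.
messages = [
    {"role": "system", "content": "You are a telecom network engineer. Reply with a validated Cisco IOS configuration only."},
    {"role": "user", "content": "Configure eBGP between local AS 65001 (10.0.0.1) and peer AS 65002 (10.0.0.2), advertising 192.168.10.0/24."},
]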

Use Cases

TSLAM-15B is ideal for a range of telecom industry applications:

  • Network Troubleshooting & Diagnostics
  • Automated Configuration Generation and Validation (BGP, OSPF, QoS, etc.; see the validation sketch after this list)
  • Technical Customer Support Chatbots
  • RF Network Planning and Capacity Management
  • Regulatory Compliance Support
  • Technical Documentation Generation and Summarization
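
For the configuration generation and validation use case, one workable pattern (a sketch only; `response` refers to the decoded model output produced by the inference code later in this card) is to gate the model's output behind a lightweight structural check before it goes anywhere near a device:

import re

def looks_like_bgp_config(text: str) -> bool:
    # Minimal structural check: a BGP process stanza plus at least one neighbor statement.
    has_process = re.search(r"^router bgp \d+", text, re.MULTILINE) is not None
    has_neighbor = re.search(r"^\s*neighbor \S+ remote-as \d+", text, re.MULTILINE) is not None
    return has_process and has_neighbor

# `response` would be the decoded model output from the inference example below.
if not looks_like_bgp_config(response):
    raise ValueError("Generated output failed the structural check; do not apply it.")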

Model Evaluation & Performance

  • Demonstrates improved telecom-specific reasoning and generation quality compared to baseline Qwen models.
  • Maintains low-latency inference thanks to 4-bit quantization and MoE efficiency.
  • Effectively handles the extremely long contexts critical for telecom workflows.

Prerequisites

To use TSLAM-15B you need the following:

  • Python >= 3.10
  • PyTorch
  • Transformers library >= 4.51.0
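
A typical setup step (the version pins are illustrative, matching the requirements above):

pip install "transformers>=4.51.0" torch huggingface_hub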


Example Code Snippet to use TSLAM-15B

For inference, you can load and run TSLAM-15B directly with the Transformers library:

import torch
from huggingface_hub import login
from transformers import AutoModelForCausalLM, AutoTokenizer
import time

# Login to Hugging Face
hf_token = "YOUR HF TOKEN"
login(token=hf_token)

# Model and tokenizer setup
model_name = "NetoAISolutions/TSLAM-15B"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).to("cuda")

# Prepare the prompt
prompt = "How is QOS applied to routers"
messages = [
    {"role": "system", "content": "You are a helpful assistant that is an expert in the telecom domain."},
    {"role": "user", "content": prompt}
]
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Tokenize input
inputs = tokenizer(input_text, return_tensors="pt").to("cuda")

# Generate response
start_time = time.time()  # Track wall-clock time to measure inference latency
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    eos_token_id=tokenizer.eos_token_id
)
inf_time = time.time() - start_time

# Strip the prompt tokens so only the newly generated text is decoded
output_ids = outputs[0][len(inputs.input_ids[0]):].tolist()

# Decode and print response
response = tokenizer.decode(output_ids, skip_special_tokens=True)
print(f"Time taken for inference: {inf_time}\n")
print("--------------------------------------------------------------------------------")
print("MODEL RESPONSE:\n")
print(response)
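
If GPU memory is tight, a common variant (an assumption on our part, not an official recommendation for this checkpoint) is to let Accelerate place the weights automatically instead of calling .to("cuda"):

# Alternative loading; requires the `accelerate` package.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",  # shard or offload layers across available devices automatically
)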