TimeLLM Supply Chain Demand Forecasting Model
This model is a fine-tuned TimeLLM (Time Series Large Language Model) for supply chain demand forecasting, trained on AWS SageMaker. TimeLLM is a reprogramming framework that repurposes LLMs for general time series forecasting while keeping the backbone language models intact.
Built for the GenAI Hackathon by Impetus & AWS (TimeLLM Supply Chain Optimization category)
Model Details
- Model Type: Time Series Forecasting
- Base Model: Meta LLaMA 3.2-3B
- Architecture: TimeLLM with transformer encoder-decoder
- Training Platform: AWS SageMaker
- Training Hardware: ml.g5.12xlarge (4 NVIDIA A10G GPUs, 48 vCPUs, 192 GB RAM)
- Inference Hardware: ml.g5.xlarge (1 NVIDIA A10G GPU, 4 vCPUs, 16 GB RAM)
- Training Duration: 1114 seconds (~18.5 minutes)
- Training Status: Completed Successfully
- Framework: PyTorch 2.0.0
- Model Size: 2.2 GB
Training Configuration
Parameter | Value |
---|---|
Sequence Length | 96 timesteps |
Prediction Length | 96 timesteps |
Label Length | 48 timesteps |
Features | 14 supply chain features |
Model Dimensions | d_model=16, d_ff=32, n_heads=8 |
Architecture | e_layers=2, d_layers=1, factor=3 |
Patch Configuration | patch_len=16, stride=8 |
Epochs | 10 (with early stopping) |
Batch Size | 32 |
Learning Rate | 0.0001 |
Optimization | DeepSpeed ZeRO Stage 2, Mixed Precision |
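For orientation, the table maps onto Time-LLM-style hyperparameters roughly as sketched below. The key names follow the upstream Time-LLM argument conventions and are assumptions; the authoritative names are whatever `train_supply_chain_complete.py` actually parses.

```python
# Illustrative hyperparameter dictionary mirroring the table above.
# Key names follow upstream Time-LLM conventions and are assumptions.
hyperparameters = {
    "seq_len": 96,        # input window length
    "label_len": 48,      # decoder warm-start length
    "pred_len": 96,       # forecast horizon
    "enc_in": 14,         # number of supply chain features
    "d_model": 16,
    "d_ff": 32,
    "n_heads": 8,
    "e_layers": 2,
    "d_layers": 1,
    "factor": 3,
    "patch_len": 16,
    "stride": 8,
    "train_epochs": 10,   # with early stopping
    "batch_size": 32,
    "learning_rate": 1e-4,
}
```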
Supply Chain Features
The model forecasts demand using 14 key supply chain features:
Feature Category | Features |
---|---|
Sales Metrics | Quantity, Line Total, Unit Price |
Promotions | Discount Percentage, Promotion Indicators, Promo Discount |
Returns | Return Quantity, Return Rate |
Inventory | Stock Status (Stockout, Low Stock), Stock Coverage |
Temporal | Day of Week, Month, Quarter |
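At inference time these 14 features must arrive as columns of a single array, in the same order used during training. A minimal sketch using a hypothetical helper, assuming a pandas DataFrame whose column names mirror the fields shown in the Data Format section below:

```python
import numpy as np
import pandas as pd

# Column names mirror the "Data Format" section and are assumptions about
# the preprocessed dataset's schema; the order must match training.
FEATURE_COLUMNS = [
    "quantity", "line_total", "unit_price", "discount_percent",
    "is_promotion", "promo_discount", "return_quantity", "return_rate",
    "is_stockout", "is_low_stock", "stock_coverage",
    "day_of_week", "month", "quarter",
]

def to_model_input(df: pd.DataFrame) -> np.ndarray:
    """Return the last 96 timesteps of a per-product series as a (96, 14) array."""
    window = df[FEATURE_COLUMNS].tail(96)
    assert len(window) == 96, "the model expects a full 96-timestep history"
    return window.to_numpy(dtype=np.float32)
```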
Use Cases
- Demand Forecasting: Predict future product demand patterns
- Inventory Planning: Optimize stock levels and procurement
- Sales Prediction: Forecast sales across multiple time horizons
- Supply Chain Optimization: Handle complex temporal dependencies
Quick Start
Prerequisites
⚠️ Important: This model requires access to Meta LLaMA 3.2-3B, which is a gated model.
1. Request Access: Visit meta-llama/Llama-3.2-3B and request access
2. Generate Token: Create a HuggingFace token with "Read" permissions
3. Set Environment:
export HF_TOKEN="hf_your_token_here"
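You can verify that the token actually grants gated access before launching anything expensive; a quick check using the standard huggingface_hub and transformers APIs:

```python
import os
from huggingface_hub import login
from transformers import AutoTokenizer

# Authenticate with the exported token, then confirm gated access to the
# LLaMA 3.2-3B tokenizer (downloads only a few small files).
login(token=os.environ["HF_TOKEN"])
AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B")
print("Token verified: gated repository is accessible.")
```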
Installation
# Clone the repository
git clone https://github.com/youneslaaroussi/project-nexus
cd project-nexus/ml
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r TimeLLM/requirements.txt
Using the Model
from modeling_timellm import TimeLLMForecaster
import numpy as np
# Initialize the forecaster
forecaster = TimeLLMForecaster(
model_path="model.pth",
config_path="config.json"
)
# Prepare your data (96 timesteps, 14 features)
historical_data = np.random.randn(96, 14) # Replace with your actual data
time_features = np.random.randn(96, 3) # month, day, weekday
# Generate forecast
forecast = forecaster.forecast(historical_data, time_features)
print(f"Forecast shape: {forecast.shape}") # (96, 14)
Training from Scratch
1. Data Preparation
# Generate synthetic ERP data
cd data_schema
python generate_data.py
# Transform to time series format
cd ../data_preprocessing
python erp_to_timeseries.py
2. Configure AWS Environment
# Configure AWS credentials
aws configure
# Set environment variables
export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_DEFAULT_REGION=us-east-1
export HF_TOKEN="hf_your_huggingface_token"
3. Launch Training on SageMaker
cd sagemaker_deployment
# Train the model (uses ml.g5.12xlarge)
python launch_sagemaker_training.py --model-name Demand_Forecasting
# Monitor training progress
aws sagemaker describe-training-job --training-job-name TimeLLM-training-Demand-Forecasting-YYYY-MM-DD-HH-MM-SS
4. Deploy for Inference
# Deploy endpoint (uses ml.g5.xlarge)
python deploy_endpoint.py
# Test the endpoint
python test_inference.py
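Beyond `test_inference.py`, the endpoint can be invoked directly with boto3. A sketch using the endpoint name from the deployment snippet later in this card and the payload layout from the Data Format section (replace the random arrays with real history):

```python
import json
import boto3
import numpy as np

runtime = boto3.client("sagemaker-runtime", region_name="us-east-1")

payload = {
    "x_enc": np.random.randn(96, 14).tolist(),      # replace with real history
    "x_mark_enc": np.random.randn(96, 3).tolist(),  # month, day, weekday covariates
}

response = runtime.invoke_endpoint(
    EndpointName="timellm-demand-forecast-endpoint",
    ContentType="application/json",
    Body=json.dumps(payload),
)
predictions = json.loads(response["Body"].read())["predictions"]
print(len(predictions), len(predictions[0]))  # 96 timesteps x 14 features
```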
Docker Deployment
Build Container
cd sagemaker_deployment
# Build the inference container
docker build -t timellm-inference:latest --build-arg HF_TOKEN=hf_your_token .
# Tag for ECR
docker tag timellm-inference:latest {account-id}.dkr.ecr.us-east-1.amazonaws.com/timellm-inference:latest
# Push to ECR
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin {account-id}.dkr.ecr.us-east-1.amazonaws.com
docker push {account-id}.dkr.ecr.us-east-1.amazonaws.com/timellm-inference:latest
Dockerfile Structure
FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:2.0.0-gpu-py310
# Install dependencies
COPY requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt
# Copy model artifacts
COPY model.tar.gz /opt/ml/model/
COPY llm_weights /opt/llm_weights
# Set up inference handler
COPY inference.py /opt/ml/model/code/
Performance Optimization
Hardware Specifications
Component | Training (ml.g5.12xlarge) | Inference (ml.g5.xlarge) |
---|---|---|
GPUs | 4x NVIDIA A10G (24GB each) | 1x NVIDIA A10G (24GB) |
vCPUs | 48 | 4 |
Memory | 192 GB | 16 GB |
Network | Up to 50 Gbps | Up to 10 Gbps |
Cost | ~$7.09/hour | ~$0.526/hour |
Optimization Techniques
- DeepSpeed ZeRO Stage 2: Reduces memory usage by 50-70%
- Mixed Precision (FP16): Faster training with maintained accuracy
- Gradient Accumulation: Simulates larger batch sizes
- Distributed Training: Multi-GPU acceleration with HuggingFace Accelerate (see the sketch below)
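A minimal sketch of the FP16 + gradient accumulation setup with HuggingFace Accelerate is shown below; the linear model and random data are stand-ins, FP16 requires a GPU, and DeepSpeed ZeRO Stage 2 itself is normally enabled through an Accelerate/DeepSpeed config file rather than in code.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

# FP16 mixed precision plus gradient accumulation via Accelerate.
accelerator = Accelerator(mixed_precision="fp16", gradient_accumulation_steps=4)

model = nn.Linear(14, 14)  # stand-in for the TimeLLM forecaster
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loader = DataLoader(TensorDataset(torch.randn(256, 14), torch.randn(256, 14)), batch_size=32)

model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

for x, y in loader:
    with accelerator.accumulate(model):   # optimizer steps every 4 micro-batches
        loss = nn.functional.mse_loss(model(x), y)
        accelerator.backward(loss)        # handles loss scaling under FP16
        optimizer.step()
        optimizer.zero_grad()
```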
Cost Analysis
Operation | Cost | Duration |
---|---|---|
Training | ~$2.13 | ~18.5 minutes |
Inference | ~$0.526/hour | Continuous |
Storage (S3) | ~$0.023/GB/month | Model artifacts |
Data Format
Input Format
{
"x_enc": [
[ # Timestep 1
100, # quantity
1000.0, # line_total
10.0, # unit_price
0.05, # discount_percent
0, # is_promotion
0.0, # promo_discount
2, # return_quantity
0.02, # return_rate
0, # is_stockout
0, # is_low_stock
30, # stock_coverage
0, # day_of_week
1, # month
1 # quarter
],
# ... 95 more timesteps
],
"x_mark_enc": [
[1, 1, 0], # month, day, weekday for timestep 1
# ... 95 more timesteps
]
}
Output Format
{
"predictions": [
[ # Predicted timestep 1
105, # forecasted quantity
1050.0, # forecasted line_total
# ... 12 more forecasted features
],
# ... 95 more predicted timesteps
]
}
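A small hypothetical helper to sanity-check a request body against these shapes before sending it to the endpoint (field names and shapes are taken from the format above):

```python
import json

def validate_payload(payload: dict) -> None:
    """Sanity-check a request body against the documented shapes."""
    assert len(payload["x_enc"]) == 96, "x_enc must contain 96 timesteps"
    assert all(len(row) == 14 for row in payload["x_enc"]), "each timestep needs 14 features"
    assert len(payload["x_mark_enc"]) == 96, "x_mark_enc must contain 96 timesteps"
    assert all(len(row) == 3 for row in payload["x_mark_enc"]), "each time mark has 3 values"
    json.dumps(payload)  # confirms the body is JSON-serializable
```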
AWS SageMaker Integration
Training Job Configuration
from sagemaker.pytorch import PyTorch
estimator = PyTorch(
entry_point='train_supply_chain_complete.py',
source_dir='../TimeLLM',
role=sagemaker_role,
instance_type='ml.g5.12xlarge',
instance_count=1,
framework_version='2.0.0',
py_version='py310',
hyperparameters={
'model_name': 'Demand_Forecasting',
'root_path': '/opt/ml/input/data/training'
}
)
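Continuing the snippet above, the job is started by pointing the estimator at the S3 training channel (the bucket path is a placeholder):

```python
# Launch the job; the "training" channel is mounted inside the container at
# /opt/ml/input/data/training, matching root_path above.
estimator.fit({"training": "s3://your-bucket/timellm/training-data/"})
```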
Endpoint Configuration
from sagemaker.pytorch import PyTorchModel
model = PyTorchModel(
model_data=model_artifacts_uri,
role=sagemaker_role,
entry_point='inference.py',
framework_version='2.0.0',
py_version='py310'
)
predictor = model.deploy(
initial_instance_count=1,
instance_type='ml.g5.xlarge',
endpoint_name='timellm-demand-forecast-endpoint'
)
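Requests can then go through the returned predictor as well; a sketch that assumes the handler in inference.py accepts and returns application/json:

```python
import numpy as np
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

# Exchange JSON bodies matching the Data Format section; this only works
# if the handler in inference.py accepts application/json.
predictor.serializer = JSONSerializer()
predictor.deserializer = JSONDeserializer()

payload = {
    "x_enc": np.random.randn(96, 14).tolist(),
    "x_mark_enc": np.random.randn(96, 3).tolist(),
}
result = predictor.predict(payload)
print(len(result["predictions"]))  # 96 forecast timesteps
```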
Monitoring and Logging
CloudWatch Integration
- Training Logs: /aws/sagemaker/TrainingJobs/{job-name}
- Endpoint Logs: /aws/sagemaker/Endpoints/{endpoint-name}
- Custom Metrics: Model performance, latency, error rates
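For programmatic access, recent training log events can be pulled with boto3. A sketch using the default SageMaker log group; in practice, pick the stream whose name starts with your training job name:

```python
import boto3

# Fetch the most recent training log events from the default SageMaker log group.
logs = boto3.client("logs", region_name="us-east-1")
streams = logs.describe_log_streams(
    logGroupName="/aws/sagemaker/TrainingJobs",
    orderBy="LastEventTime",
    descending=True,
    limit=1,
)
events = logs.get_log_events(
    logGroupName="/aws/sagemaker/TrainingJobs",
    logStreamName=streams["logStreams"][0]["logStreamName"],
    limit=20,
)
for event in events["events"]:
    print(event["message"])
```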
Performance Metrics
Metric | Description |
---|---|
MAE | Mean Absolute Error |
MSE | Mean Squared Error |
MAPE | Mean Absolute Percentage Error |
Latency | Inference response time (~2-3 seconds) |
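These follow the standard definitions; a small reference implementation is sketched below (the epsilon guard in MAPE is an assumption to avoid division by zero and may differ from what the training script computes):

```python
import numpy as np

def evaluate(y_true: np.ndarray, y_pred: np.ndarray, eps: float = 1e-8) -> dict:
    """Compute MAE, MSE, and MAPE over a (timesteps, features) forecast."""
    err = y_pred - y_true
    return {
        "MAE": float(np.mean(np.abs(err))),
        "MSE": float(np.mean(err ** 2)),
        "MAPE": float(np.mean(np.abs(err) / (np.abs(y_true) + eps)) * 100),
    }
```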
TimeLLM Framework
This implementation is based on the TimeLLM framework, which introduces:
- Reprogramming: Converts time series into text prototype representations
- Prompt Augmentation: Uses declarative prompts for domain knowledge
- LLM Backbone: Leverages pre-trained language models for forecasting
Key Modifications
- Supply Chain Prompts: Domain-specific prompts for demand forecasting
- HuggingFace Integration: Seamless model loading and tokenization
- AWS Optimization: SageMaker-specific inference handlers
- Performance Tuning: DeepSpeed and mixed precision support
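The "Supply Chain Prompts" modification refers to prompt-as-prefix text in the Time-LLM style. For flavor, a hypothetical example is sketched below; the actual prompt used by this model is defined in the training code, and the window statistics would be computed per input at runtime:

```python
def build_prompt(min_val: float, max_val: float, median: float, trend: str) -> str:
    """Hypothetical supply-chain prompt-as-prefix; illustrative only."""
    return (
        "Dataset: daily supply chain records (order quantity, revenue, pricing, "
        "promotions, returns, stock status) for a single product. "
        "Task: forecast the next 96 days from the previous 96 days of history. "
        f"Input statistics: min {min_val:.2f}, max {max_val:.2f}, "
        f"median {median:.2f}, overall trend {trend}."
    )
```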
Model Variants
Model | Purpose | Use Case |
---|---|---|
Demand Forecasting | Predict future product demand | Inventory planning, procurement |
Product Forecasting | Product-specific metrics | Product lifecycle management |
Category Forecasting | Electronics category sales | Category management, marketing |
KPI Forecasting | Key performance indicators | Executive dashboards, strategic planning |
Troubleshooting
Common Issues
HuggingFace Access Denied
# Verify token access
python -c "from transformers import AutoTokenizer; AutoTokenizer.from_pretrained('meta-llama/Llama-3.2-3B')"
Training Job Fails
# Check CloudWatch logs
aws logs describe-log-groups --log-group-name-prefix "/aws/sagemaker/TrainingJobs"
Endpoint Timeout
# Check endpoint status
aws sagemaker describe-endpoint --endpoint-name timellm-demand-forecast-endpoint
Citations
This Model
@misc{projectnexus-timellm-2025,
title={TimeLLM Supply Chain Demand Forecasting},
author={Younes Laaroussi},
year={2025},
howpublished={Hugging Face Model Hub},
url={https://huggingface.co/youneslaaroussi/projectnexus},
note={Trained on AWS SageMaker ml.g5.12xlarge using TimeLLM framework}
}
TimeLLM Framework
@inproceedings{jin2023time,
title={{Time-LLM}: Time series forecasting by reprogramming large language models},
author={Jin, Ming and Wang, Shiyu and Ma, Lintao and Chu, Zhixuan and Zhang, James Y and Shi, Xiaoming and Chen, Pin-Yu and Liang, Yuxuan and Li, Yuan-Fang and Pan, Shirui and Wen, Qingsong},
booktitle={International Conference on Learning Representations (ICLR)},
year={2024}
}
License
This model is released under the MIT License, consistent with the TimeLLM framework.
Acknowledgments
- TimeLLM for the foundational framework
- AWS SageMaker for the training infrastructure
- Meta LLaMA for the base model
- HuggingFace for model hosting and transformers library
- DeepSpeed for optimization techniques