|
# Orpheus Music Transformer Training Logs |
|
|
|
*** |
|
|
|
## Training hyperparameters for both models (base and bridge) |
|
|
|
- GPU: Single Nvidia GH200 96GB |
|
- learning_rate: 1e-04 |
|
- train_batch_size: 9 |
|
- gradients_accumulation_value: 8 |
|
- gradients_clip_value: 1.0 |
|
- eval_batch_size: 9 |
|
- optimizer: PyTorch Adam with default parameters |
|
- lr_scheduler_type: constant |
|
- precision: bfloat16 |
|
- embeddings: RoPE |
|
- attention: flash |
|
- num_epochs: 3 |
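
Below is a minimal sketch of how these hyperparameters might map onto a standard PyTorch training loop. It is not the project's actual training script: the model, dataset, and vocabulary size are hypothetical placeholders, and RoPE embeddings plus flash attention would live inside the real transformer rather than this stub. Only the numeric values (lr 1e-4, batch size 9, accumulation 8, clip 1.0, bfloat16, 3 epochs) come from the list above; note they imply an effective batch size of 9 × 8 = 72.

```python
# Sketch only: hypothetical placeholders stand in for the real Orpheus model/data.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

VOCAB_SIZE = 512      # hypothetical token vocabulary size
ACCUM_STEPS = 8       # gradients_accumulation_value
CLIP_VALUE = 1.0      # gradients_clip_value

# Placeholder standing in for the actual transformer (assumes a CUDA device,
# e.g. the GH200 mentioned above).
model = nn.Sequential(
    nn.Embedding(VOCAB_SIZE, 64),
    nn.Linear(64, VOCAB_SIZE),
).cuda()

# PyTorch Adam with default parameters; lr stays constant (no scheduler).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# Dummy next-token-prediction data; train_batch_size = 9.
data = TensorDataset(torch.randint(0, VOCAB_SIZE, (90, 128)))
loader = DataLoader(data, batch_size=9, shuffle=True)

for epoch in range(3):                      # num_epochs: 3
    for step, (tokens,) in enumerate(loader):
        tokens = tokens.cuda()
        # bfloat16 mixed precision via autocast.
        with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
            logits = model(tokens[:, :-1])
            loss = loss_fn(logits.reshape(-1, VOCAB_SIZE),
                           tokens[:, 1:].reshape(-1))
        # Accumulate gradients over 8 micro-batches (effective batch 72).
        (loss / ACCUM_STEPS).backward()
        if (step + 1) % ACCUM_STEPS == 0:
            torch.nn.utils.clip_grad_norm_(model.parameters(), CLIP_VALUE)
            optimizer.step()
            optimizer.zero_grad()
```
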
|
|
|
*** |
|
|
|
## [Base model train/eval metrics](https://huggingface.co/asigalov61/Orpheus-Music-Transformer/tree/main/logs/base) |
|
|
|
### Cross-entropy losses/accuracies (averaged over the last 100 logged values)
|
|
|
- train_loss: 0.6559235458076 |
|
- train_acc: 0.8074536085128784 |
|
- eval_loss: 0.6947438363730908 |
|
- eval_acc: 0.7946965831518173 |
|
|
|
- epochs: 3 |
|
- num_steps: 96332 |
|
- training_time: 194 hours (~8 days)
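
For reference, a minimal sketch (not the project's actual logging code) of how figures like those above could be produced by averaging the most recent 100 per-step values; the `RunningAverage` helper and the window size of 100 are the only assumptions here, the latter taken from the heading above.

```python
# Hypothetical helper for windowed metric smoothing.
from collections import deque

class RunningAverage:
    """Keeps the mean of the most recent `window` logged values."""
    def __init__(self, window: int = 100):
        self.values = deque(maxlen=window)

    def update(self, value: float) -> float:
        self.values.append(value)
        return sum(self.values) / len(self.values)

# Inside a training loop one would call, e.g.:
#   smoothed_train_loss = train_loss_avg.update(loss.item())
train_loss_avg = RunningAverage(window=100)
for v in [0.70, 0.66, 0.65]:
    print(train_loss_avg.update(v))   # prints the running mean so far
```
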
|
|
|
*** |
|
|
|
### Project Los Angeles |
|
### Tegridy Code 2025 |