# Orpheus Music Transformer Training Logs
***
## Training hyperparameters for both models (base and bridge)
- GPU: Single Nvidia GH200 96GB
- learning_rate: 1e-04
- train_batch_size: 9
- gradients_accumulation_value: 8
- gradients_clip_value: 1.0
- eval_batch_size: 9
- optimizer: PyTorch Adam with default parameters
- lr_scheduler_type: constant
- precision: bfloat16
- embeddings: RoPE
- attention: flash
- num_epochs: 3
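The hyperparameters above can be sketched as a minimal PyTorch training step. This is a hedged illustration only: the model here is a tiny placeholder, not the actual Orpheus transformer, and the loss/data are dummies; the Adam settings, accumulation factor, clip value, and bfloat16 autocast mirror the list above.

```python
import torch

# Placeholder model standing in for the real transformer
model = torch.nn.Linear(16, 16)

# Hyperparameters from the list above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # Adam, default params
accum_steps = 8    # gradients_accumulation_value
clip_value = 1.0   # gradients_clip_value
batch_size = 9     # train_batch_size

# Dummy data; the real run trains on tokenized MIDI sequences
data = [torch.randn(batch_size, 16) for _ in range(accum_steps)]

model.train()
optimizer.zero_grad()
for step, batch in enumerate(data, start=1):
    # bfloat16 mixed precision, as listed (CPU autocast shown for portability)
    with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
        loss = torch.nn.functional.mse_loss(model(batch), batch)
    # Scale the loss so accumulated gradients average over accum_steps
    (loss / accum_steps).backward()
    if step % accum_steps == 0:
        torch.nn.utils.clip_grad_norm_(model.parameters(), clip_value)
        optimizer.step()
        optimizer.zero_grad()
```

With a constant LR schedule, no scheduler object is needed: the optimizer's `lr` simply never changes over the 3 epochs.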
***
## [Base model train/eval metrics](https://huggingface.co/asigalov61/Orpheus-Music-Transformer/tree/main/logs/base)
### Cross-entropy losses/accuracies (averaged over the last 100 logged values)
- train_loss: 0.6559235458076
- train_acc: 0.8074536085128784
- eval_loss: 0.6947438363730908
- eval_acc: 0.7946965831518173
- epochs: 3
- num_steps: 96332
- training_time: 194 hours / 8 days
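The metrics above are tail averages of the training log rather than single final readings. A hypothetical helper (the function name and dummy log are my own, not from the source) showing how such a "last 100 values" average is computed:

```python
def avg_last_n(values, n=100):
    """Mean of the final n entries of a metric log (or all of them if fewer)."""
    tail = values[-n:]
    return sum(tail) / len(tail)

# Dummy loss curve; the real numbers come from the linked log files
dummy_losses = [1.0 - 0.0001 * i for i in range(5000)]
final_train_loss = avg_last_n(dummy_losses)
```

Averaging the tail smooths out per-step noise, which is why the reported train/eval figures carry so many decimal places.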
***
### Project Los Angeles
### Tegridy Code 2025