
Orpheus Music Transformer Training Logs


Training hyperparameters for both models (base and bridge)

  • GPU: Single NVIDIA GH200 96GB
  • learning_rate: 1e-04
  • train_batch_size: 9
  • gradients_accumulation_value: 8
  • gradients_clip_value: 1.0
  • eval_batch_size: 9
  • optimizer: PyTorch Adam with default parameters
  • lr_scheduler_type: constant
  • precision: bfloat16
  • embeddings: RoPE
  • attention: flash
  • num_epochs: 3
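The batch size and accumulation value above combine into an effective batch size per optimizer update. The following is a minimal sketch, assuming standard gradient accumulation (gradients from several micro-batches are summed before each optimizer step). The helper names `effective_batch_size` and `is_update_step` are hypothetical and do not come from the training code.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 9
gradients_accumulation_value = 8


def effective_batch_size(micro_batch: int, accumulation_steps: int) -> int:
    """Number of sequences contributing to one optimizer update on a single GPU."""
    return micro_batch * accumulation_steps


def is_update_step(micro_step: int, accumulation_steps: int) -> bool:
    """True when the optimizer would step after this 0-indexed micro-batch.

    Assumption: an update happens after every `accumulation_steps` micro-batches.
    """
    return (micro_step + 1) % accumulation_steps == 0


print(effective_batch_size(train_batch_size, gradients_accumulation_value))  # 72
```

Under that assumption, each optimizer update sees 9 × 8 = 72 sequences.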

Base model train/eval metrics

Cross-entropy losses and accuracies (averaged over the last 100 values)

  • train_loss: 0.6559235458076

  • train_acc: 0.8074536085128784

  • eval_loss: 0.6947438363730908

  • eval_acc: 0.7946965831518173

  • epochs: 3

  • num_steps: 96332

  • training_time: 194 hours (about 8 days)
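The reported numbers can be turned into a few derived quantities. This is a rough sketch, assuming the losses are in nats (the PyTorch cross-entropy convention) and that the "last 100 values" average is a plain arithmetic mean over the tail of the logged series. The helper names `trailing_mean` and `perplexity` are hypothetical, and whether `num_steps` counts micro-batches or optimizer updates is not stated in the logs.

```python
import math

# Values copied from the metrics list above.
train_loss = 0.6559235458076
eval_loss = 0.6947438363730908
num_steps = 96332
training_hours = 194


def trailing_mean(values, window=100):
    """Arithmetic mean of the last `window` values (assumed averaging scheme).

    Expects a non-empty sequence.
    """
    tail = list(values)[-window:]
    return sum(tail) / len(tail)


def perplexity(loss_nats: float) -> float:
    """Per-token perplexity, assuming the loss is measured in nats."""
    return math.exp(loss_nats)


train_ppl = perplexity(train_loss)
eval_ppl = perplexity(eval_loss)

# Average wall-clock cost per logged step over the whole run.
steps_per_hour = num_steps / training_hours
seconds_per_step = training_hours * 3600 / num_steps

print(f"train perplexity: {train_ppl:.3f}")
print(f"eval perplexity:  {eval_ppl:.3f}")
print(f"steps per hour:   {steps_per_hour:.1f}")
print(f"seconds per step: {seconds_per_step:.2f}")
```

Under these assumptions, both perplexities come out close to 2, and the run averages a little over 7 seconds per step.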


Project Los Angeles

Tegridy Code 2025