# Orpheus Music Transformer Training Logs

***

## Training hyperparameters for both models (base and bridge)

- GPU: single NVIDIA GH200 96GB
- learning_rate: 1e-04
- train_batch_size: 9
- gradients_accumulation_value: 8
- gradients_clip_value: 1.0
- eval_batch_size: 9
- optimizer: PyTorch Adam with default parameters
- lr_scheduler_type: constant
- precision: bfloat16
- embeddings: RoPE
- attention: flash
- num_epochs: 4
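
As a rough illustration of how the batch settings above combine, the sketch below stores a few of the listed values in a hypothetical `TrainConfig` dataclass (the class and its `effective_batch_size` property are not part of the Orpheus training code) and computes the number of sequences contributing to one optimizer update. It assumes the usual meaning of gradient accumulation, where gradients from several micro-batches are summed before a single optimizer step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the hyperparameters listed above."""

    learning_rate: float = 1e-4
    train_batch_size: int = 9
    gradients_accumulation_value: int = 8
    gradients_clip_value: float = 1.0
    eval_batch_size: int = 9
    num_epochs: int = 4

    @property
    def effective_batch_size(self) -> int:
        # Assumption: one optimizer step is taken after accumulating
        # gradients over `gradients_accumulation_value` micro-batches.
        return self.train_batch_size * self.gradients_accumulation_value


config = TrainConfig()
print(config.effective_batch_size)  # 72
```

Under that assumption, each optimizer update sees 9 × 8 = 72 sequences.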

***

## [Base model train/eval metrics](https://huggingface.co/asigalov61/Orpheus-Music-Transformer/tree/main/logs/base)

### Cross-entropy losses and accuracies (averaged over the last 100 values)

- train_loss: 0.6657808522880078
- train_acc: 0.8036309605836869
- val_loss: 0.685554896891117
- val_acc: 0.7972136563062668

- epochs: 4
- num_steps: 128497
- training_time: 256 hours (about 10.67 days)
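
The "averaged last 100 values" figures can be reproduced from a logged series with a simple tail average. The sketch below is a minimal, assumed version of that calculation: `tail_mean` is a hypothetical helper, the example series is synthetic placeholder data rather than real training output, and the format of the actual log files is not assumed. The final lines derive an average step rate from the reported `num_steps` and `training_time`.

```python
from statistics import fmean
from typing import Sequence


def tail_mean(values: Sequence[float], window: int = 100) -> float:
    """Return the mean of the last `window` entries of `values`."""
    if not values:
        raise ValueError("values must not be empty")
    if window <= 0:
        raise ValueError("window must be positive")
    # Slicing with a negative start keeps at most `window` trailing items.
    return fmean(values[-window:])


# Synthetic placeholder series, NOT real training data.
placeholder_losses = [1.0] * 50 + [0.5] * 100
print(tail_mean(placeholder_losses))  # 0.5

# Average step rate from the figures reported above.
num_steps = 128497
training_hours = 256
steps_per_hour = num_steps / training_hours
print(f"{steps_per_hour:.2f} steps/hour")  # 501.94 steps/hour
```

If a series has fewer than `window` entries, the helper averages whatever is available.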

***

### Project Los Angeles
### Tegridy Code 2025