enrique2701
/

ppo-Pyramids

Reinforcement Learning

deep-reinforcement-learning

ML-Agents-Pyramids


enrique2701 committed on Feb 25, 2024

Commit 881b498 (verified) · 1 Parent(s): 472efe3

Update README.md

Files changed (1)

README.md +39 -0

README.md CHANGED

@@ -11,6 +11,45 @@ tags:
   This is a trained model of a **ppo** agent playing **Pyramids**
   using the [Unity ML-Agents Library](https://github.com/Unity-Technologies/ml-agents).
+  ## Results
+[INFO] Pyramids. Step: 2320000. Time Elapsed: 4995.783 s. Mean Reward: 1.775. Std of Reward: 0.113.
+  ## Hyperparameters
+  ```yaml
+# File: /content/ml-agents/config/ppo/PyramidsRND.yaml (written with the %%file notebook magic)
+behaviors:
+  Pyramids:
+    trainer_type: ppo
+    hyperparameters:
+      batch_size: 252
+      buffer_size: 4096
+      learning_rate: 0.0003
+      beta: 0.01
+      epsilon: 0.2
+      lambd: 0.95
+      num_epoch: 3
+      learning_rate_schedule: linear
+    network_settings:
+      normalize: false
+      hidden_units: 512
+      num_layers: 2
+      vis_encode_type: nature_cnn
+    reward_signals:
+      extrinsic:
+        gamma: 0.99
+        strength: 1.0
+      rnd:
+        gamma: 0.99
+        strength: 0.01
+        network_settings:
+          hidden_units: 64
+          num_layers: 3
+        learning_rate: 0.0001
+    keep_checkpoints: 5
+    max_steps: 3000000
+    time_horizon: 512
+    summary_freq: 10000
+```
   ## Usage (with ML-Agents)
   The Documentation: https://unity-technologies.github.io/ml-agents/ML-Agents-Toolkit-Documentation/
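
A few sanity checks can be read off the numbers in the Results line and the Hyperparameters block added by this commit. The sketch below is not part of the commit or of ML-Agents: the regular expression, `parse_summary`, and `linear_schedule` are hypothetical helpers written for this one log line, and the schedule assumes a plain linear decay to zero at `max_steps`, which may differ in detail from what the trainer actually does.

```python
import re

# Values copied from the Hyperparameters block in the diff above.
BATCH_SIZE = 252
BUFFER_SIZE = 4096
LEARNING_RATE = 0.0003
MAX_STEPS = 3_000_000

# The single summary line from the Results section.
LOG_LINE = (
    "[INFO] Pyramids. Step: 2320000. Time Elapsed: 4995.783 s. "
    "Mean Reward: 1.775. Std of Reward: 0.113."
)

# Hypothetical pattern for this line format; not an ML-Agents API.
LOG_PATTERN = re.compile(
    r"Step: (?P<step>\d+)\. Time Elapsed: (?P<elapsed>[\d.]+) s\. "
    r"Mean Reward: (?P<mean>-?[\d.]+)\. Std of Reward: (?P<std>[\d.]+)\."
)


def parse_summary(line: str) -> dict:
    """Extract step, elapsed seconds, and reward statistics from a summary line."""
    match = LOG_PATTERN.search(line)
    if match is None:
        raise ValueError(f"unrecognized summary line: {line!r}")
    return {
        "step": int(match["step"]),
        "elapsed_s": float(match["elapsed"]),
        "mean_reward": float(match["mean"]),
        "std_reward": float(match["std"]),
    }


def linear_schedule(initial: float, step: int, max_steps: int) -> float:
    """Assumed simplification: decay linearly from `initial` to zero at `max_steps`."""
    return initial * max(0.0, 1.0 - step / max_steps)


if __name__ == "__main__":
    summary = parse_summary(LOG_LINE)
    progress = summary["step"] / MAX_STEPS
    steps_per_second = summary["step"] / summary["elapsed_s"]
    full_batches, leftover = divmod(BUFFER_SIZE, BATCH_SIZE)
    lr_now = linear_schedule(LEARNING_RATE, summary["step"], MAX_STEPS)

    print(f"progress toward max_steps: {progress:.1%}")
    print(f"throughput: {steps_per_second:.0f} steps/s")
    print(f"full minibatches per buffer: {full_batches} (leftover samples: {leftover})")
    print(f"approximate learning rate at this step: {lr_now:.2e}")
```

Two things stand out under these assumptions: the reported result comes from step 2,320,000 of a configured 3,000,000, so it is a mid-run summary rather than the final one, and `buffer_size` is not an exact multiple of `batch_size`, which leaves a remainder when the buffer is split into minibatches.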