metadata
license: mit
library_name: pytorch
pipeline_tag: reinforcement-learning
language:
  - en
tags:
  - reinforcement-learning
  - gymnasium
  - CartPole-v1
  - ppo
  - pytorch
model-index:
  - name: CartPole-v1_ppo
    results:
      - task:
          type: reinforcement-learning
          name: Reinforcement Learning
        dataset:
          name: CartPole-v1
          type: gymnasium
        metrics:
          - name: Best Eval Reward
            type: reward
            value: 272.3999938964844
          - name: Current Eval Reward
            type: reward
            value: 500
          - name: Epoch
            type: epoch
            value: 199
          - name: Total Timesteps
            type: timesteps
            value: 0

CartPole-v1_ppo

Run: cvb5lyfw — Env: CartPole-v1 — Algo: ppo

This repository contains artifacts from a Gymnasium Solver training run.

Contents

  • Config: artifacts/configs/config.json
  • Checkpoints: artifacts/checkpoints/*.ckpt
  • Logs: artifacts/logs/*.log
  • Video: artifacts/videos/**/best_checkpoint.mp4 (also previewed below)
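
The layout above can be checked against a local copy of the repository. The following is a minimal sketch, assuming the repository has already been downloaded or cloned to a local directory; the helper name `find_artifacts` and the dictionary keys are hypothetical and not part of Gymnasium Solver.

```python
from pathlib import Path

# Glob patterns copied from the Contents list above.
ARTIFACT_PATTERNS = {
    "config": "artifacts/configs/config.json",
    "checkpoints": "artifacts/checkpoints/*.ckpt",
    "logs": "artifacts/logs/*.log",
    "videos": "artifacts/videos/**/best_checkpoint.mp4",
}


def find_artifacts(repo_root):
    """Return the artifact paths (relative to repo_root) found for each pattern.

    Hypothetical helper for illustration only: it inspects a local checkout
    and does not download anything.
    """
    root = Path(repo_root)
    return {
        name: sorted(p.relative_to(root).as_posix() for p in root.glob(pattern))
        for name, pattern in ARTIFACT_PATTERNS.items()
    }
```

Calling `find_artifacts(".")` from the repository root returns one list per artifact type; an empty list means nothing matched that pattern in the local copy.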

Preview

If the video above doesn't load, try the fallback: replay.mp4

Config (excerpt)

{
  "env_id": "CartPole-v1",
  "algo_id": "ppo",
  "n_steps": 32,
  "batch_size": 256,
  "n_epochs": 20,
  "n_timesteps": 100000.0,
  "seed": 42,
  "n_envs": 8,
  "obs_type": "rgb",
  "policy": "MlpPolicy",
  "learning_rate": 0.001,
  "gamma": 0.98,
  "gae_lambda": 0.8,
  "ent_coef": 0.0,
  "vf_coef": 0.5,
  "clip_range": 0.2,
  "normalize_advantages": "batch"
}
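
To see how the rollout-related settings fit together, the sketch below derives a few quantities from the excerpt. It assumes the common PPO convention (as used in Stable-Baselines3, for example) that each rollout collects `n_steps` transitions from each of the `n_envs` environments and is then split into minibatches of `batch_size` for `n_epochs` passes. Whether Gymnasium Solver interprets these fields the same way is an assumption, and `rollout_summary` is a hypothetical helper.

```python
import json

# Fields copied from the config excerpt above (only those used here).
CONFIG_EXCERPT = """
{
  "n_steps": 32,
  "batch_size": 256,
  "n_epochs": 20,
  "n_timesteps": 100000.0,
  "n_envs": 8
}
"""


def rollout_summary(cfg):
    """Derive rollout arithmetic under the assumed PPO convention."""
    # Transitions gathered per rollout across all parallel environments.
    rollout_size = cfg["n_steps"] * cfg["n_envs"]
    # Minibatches per pass over the rollout (ceiling division).
    minibatches = -(-rollout_size // cfg["batch_size"])
    # Gradient steps taken on each rollout.
    grad_steps = minibatches * cfg["n_epochs"]
    # Complete rollouts that fit in the timestep budget.
    n_rollouts = int(cfg["n_timesteps"]) // rollout_size
    return {
        "rollout_size": rollout_size,
        "minibatches_per_epoch": minibatches,
        "grad_steps_per_rollout": grad_steps,
        "n_rollouts": n_rollouts,
    }


summary = rollout_summary(json.loads(CONFIG_EXCERPT))
print(summary)
```

Under that assumption, 32 steps across 8 environments gives 256 transitions per rollout, which matches `batch_size`, so each of the 20 passes would use the whole rollout as a single minibatch.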