A2C CartPole Model

This is an A2C (Advantage Actor-Critic) model trained to balance a pole on a moving cart. The model was trained using Stable-Baselines3.

Task Description

The CartPole task involves balancing a pole attached by an unactuated joint to a cart that moves along a frictionless track. The goal is to keep the pole upright by pushing the cart left or right. An episode ends when any of the following occurs:

  • The pole angle exceeds ±12 degrees from vertical (termination)
  • The cart position exceeds ±2.4 units from the center (termination)
  • The episode length reaches 500 steps (truncation)
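
The end-of-episode conditions listed above can be sketched as a small check. This is only an illustration and not the environment's own code: the function name episode_status and the constant names are hypothetical, and the thresholds are taken directly from the list above.

```python
import math

# Thresholds as stated in the task description.
POLE_ANGLE_LIMIT_RAD = math.radians(12)  # ±12 degrees from vertical
CART_POSITION_LIMIT = 2.4                # ±2.4 units from the center
MAX_EPISODE_STEPS = 500                  # step limit for CartPole-v1


def episode_status(cart_position, pole_angle_rad, steps_taken):
    """Return (terminated, truncated) for a hypothetical state.

    terminated: the pole fell too far or the cart left the track.
    truncated:  the step limit was reached.
    """
    terminated = (
        abs(cart_position) > CART_POSITION_LIMIT
        or abs(pole_angle_rad) > POLE_ANGLE_LIMIT_RAD
    )
    truncated = steps_taken >= MAX_EPISODE_STEPS
    return terminated, truncated
```

In the Gymnasium observation vector, the cart position is element 0 and the pole angle (in radians) is element 2, so a call such as episode_status(obs[0], obs[2], step_count) would mirror the flags returned by env.step.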

Training Details

  • Environment: CartPole-v1
  • Algorithm: A2C (Advantage Actor-Critic)
  • Training Steps: 50,000
  • Policy: MlpPolicy
  • Learning Rate: 0.001
  • N_steps: 5
  • Gamma: 0.99
  • Training Framework: Stable-Baselines3
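
The settings above could be reproduced with a script along the following lines. This is a minimal sketch, not the author's actual training script: the names HYPERPARAMS, TOTAL_TIMESTEPS, and train, as well as the save path, are assumptions, and any setting not listed above is left at the Stable-Baselines3 default.

```python
# Hyperparameters copied from the list above; everything else uses
# the Stable-Baselines3 defaults (an assumption, not confirmed by the card).
HYPERPARAMS = {
    "policy": "MlpPolicy",
    "learning_rate": 0.001,
    "n_steps": 5,
    "gamma": 0.99,
}
TOTAL_TIMESTEPS = 50_000


def train(save_path="a2c-cartpole-v1"):
    """Train A2C on CartPole-v1 and save the model (save_path is hypothetical)."""
    # Imported here so the configuration above can be inspected
    # without the RL libraries installed.
    import gymnasium as gym
    from stable_baselines3 import A2C

    env = gym.make("CartPole-v1")
    model = A2C(env=env, verbose=1, **HYPERPARAMS)
    model.learn(total_timesteps=TOTAL_TIMESTEPS)
    model.save(save_path)  # writes save_path + ".zip"
    env.close()
    return model
```

Calling train() runs the full 50,000 steps. Results will vary between runs unless a seed is passed to A2C, which this sketch does not do.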

Usage

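import gymnasium as gym
from huggingface_sb3 import load_from_hub
from stable_baselines3 import A2C

# Create the environment with on-screen rendering
env = gym.make("CartPole-v1", render_mode="human")

# A2C.load expects a local file, so download the checkpoint from the Hub first.
# NOTE: the filename is an assumption; check the repository's file list.
checkpoint = load_from_hub(repo_id="StevanLS/a2c-cartpole-v1", filename="a2c-cartpole-v1.zip")
model = A2C.load(checkpoint)

# Run the trained policy until interrupted with Ctrl+C
obs, _ = env.reset()
try:
    while True:
        action, _ = model.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, info = env.step(action)
        if terminated or truncated:
            obs, _ = env.reset()
except KeyboardInterrupt:
    pass
finally:
    env.close()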
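
To measure performance rather than only watch the rendering, the policy can be run for a fixed number of episodes and the returns summarized. The sketch below uses Stable-Baselines3's evaluate_policy helper. The function names summarize_returns and evaluate are hypothetical, and treating a return of 500 as a "success" is an assumed definition, not one given by this model card.

```python
from statistics import mean, pstdev


def summarize_returns(episode_returns, success_threshold=500.0):
    """Summarize per-episode returns.

    success_threshold is an assumed definition of success (reaching the
    500-step limit, i.e. a return of 500); adjust it to your own criterion.
    """
    successes = sum(1 for r in episode_returns if r >= success_threshold)
    return {
        "mean_reward": mean(episode_returns),
        "std_reward": pstdev(episode_returns),
        "success_rate": successes / len(episode_returns),
    }


def evaluate(model, n_eval_episodes=10):
    """Run the policy without rendering and summarize the results."""
    # Imported lazily so summarize_returns works without these packages.
    import gymnasium as gym
    from stable_baselines3.common.evaluation import evaluate_policy
    from stable_baselines3.common.monitor import Monitor

    env = Monitor(gym.make("CartPole-v1"))
    episode_returns, _episode_lengths = evaluate_policy(
        model,
        env,
        n_eval_episodes=n_eval_episodes,
        deterministic=True,
        return_episode_rewards=True,
    )
    env.close()
    return summarize_returns(episode_returns)
```

With a loaded model, evaluate(model) returns a dictionary with the mean return, its standard deviation, and the fraction of episodes that met the threshold.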

Author

  • StevanLS

Citations

@misc{gymnasium2023,
    author={Farama Foundation},
    title={Gymnasium},
    year={2023},
    publisher={GitHub},
    howpublished={GitHub repository},
    url={https://github.com/Farama-Foundation/Gymnasium}
}

@article{raffin2021stable,
    title={Stable-baselines3: Reliable reinforcement learning implementations},
    author={Raffin, Antonin and Hill, Ashley and Gleave, Adam and Kanervisto, Anssi and Ernestus, Maximilian and Dormann, Noah},
    journal={Journal of Machine Learning Research},
    year={2021}
}
