ARCHIVED MODEL, DO NOT USE IT
stable-baselines3-ppo-LunarLander-v2
This is a saved model of a PPO agent playing LunarLander-v2. The model is taken from rl-baselines3-zoo.
The goal is to land the lander correctly by firing its engines (left orientation engine, main engine, and right orientation engine).
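As a small illustration of the controls described above, the sketch below maps the action indices of LunarLander-v2's discrete action space to readable names. The dictionary and the `describe_action` helper are hypothetical names introduced here for illustration; they are not part of gym or Stable Baselines 3.

```python
# Action indices of LunarLander-v2's discrete action space (4 actions).
# LUNAR_LANDER_ACTIONS and describe_action are illustrative helpers,
# not part of gym or Stable Baselines 3.
LUNAR_LANDER_ACTIONS = {
    0: "do nothing",
    1: "fire left orientation engine",
    2: "fire main engine",
    3: "fire right orientation engine",
}


def describe_action(action) -> str:
    """Return a human-readable name for an action index."""
    # int() also accepts the NumPy integer returned by model.predict.
    return LUNAR_LANDER_ACTIONS[int(action)]


print(describe_action(2))
```

A helper like this can be handy for logging what the agent chooses at each step while watching it play.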
You can watch the agent play by using this notebook.
Use the Model
Install the dependencies
You need to use the Stable Baselines 3 Hugging Face version of the library (this version contains the function to load saved models directly from the Hugging Face Hub):
pip install git+https://github.com/simoninithomas/stable-baselines3.git
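Since loading from the Hub depends on this particular version of the library, it may be worth checking after installation that the loader is actually present. The sketch below is one possible check; `class_has_attr` is a hypothetical helper written for this example, and it simply reports `False` if the package is not installed.

```python
import importlib


def class_has_attr(module_name: str, class_name: str, attr: str) -> bool:
    """Return True if module_name.class_name exists and has attribute attr."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is not installed (or failed to import).
        return False
    cls = getattr(module, class_name, None)
    return cls is not None and hasattr(cls, attr)


if __name__ == "__main__":
    # load_from_huggingface is the method used later in this README.
    available = class_has_attr("stable_baselines3", "PPO", "load_from_huggingface")
    print("PPO.load_from_huggingface available:", available)
```

If this prints `False`, the installed package is probably not the version described above.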
Evaluate the agent
⚠️ You need Linux or macOS to use this environment. If that's not the case, you can use the Colab notebook.
# Import the libraries
import gym
from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy
# Load the environment
env = gym.make('LunarLander-v2')
# Load the saved model from the Hugging Face Hub
model = PPO.load_from_huggingface(hf_model_id="ThomasSimonini/stable-baselines3-ppo-LunarLander-v2", hf_model_filename="LunarLander-v2")
# Evaluate the agent
eval_env = gym.make('LunarLander-v2')
mean_reward, std_reward = evaluate_policy(model, eval_env, n_eval_episodes=10, deterministic=True)
print(f"mean_reward={mean_reward:.2f} +/- {std_reward:.2f}")
# Watch the agent play
obs = env.reset()
for i in range(1000):
    action, _state = model.predict(obs)
    obs, reward, done, info = env.step(action)
    env.render()
    # Start a new episode once the current one ends
    if done:
        obs = env.reset()
# Release the rendering window
env.close()
Results
Mean Reward (10 evaluation episodes): 245.63 +/- 10.02
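For reference, a figure of the form "mean +/- std" summarizes the per-episode returns of the evaluation run. The sketch below shows how such a summary can be computed, assuming the population standard deviation (the default behaviour of `numpy.std`). The `summarize_returns` helper and the example returns are made up for illustration; they are not the actual episode returns behind the result above.

```python
from statistics import fmean, pstdev


def summarize_returns(episode_returns):
    """Return (mean, population standard deviation) of per-episode returns."""
    mean_reward = fmean(episode_returns)
    # pstdev is the population standard deviation (divides by N).
    std_reward = pstdev(episode_returns)
    return mean_reward, std_reward


# Hypothetical returns, for illustration only.
example_returns = [240.0, 250.0, 260.0]
mean_reward, std_reward = summarize_returns(example_returns)
print(f"mean_reward={mean_reward:.2f} +/- {std_reward:.2f}")
```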