---
tags:
- Walker2d-v4
- deep-reinforcement-learning
- reinforcement-learning
- custom-implementation
library_name: cleanrl
model-index:
- name: TD3
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: Walker2d-v4
      type: Walker2d-v4
    metrics:
    - type: mean_reward
      value: 3621.95 +/- 398.07
      name: mean_reward
      verified: false
---
					
						
# (CleanRL) **TD3** Agent Playing **Walker2d-v4**

This is a trained model of a TD3 agent playing Walker2d-v4.
The model was trained with [CleanRL](https://github.com/vwxyzjn/cleanrl); the most up-to-date training code is in
[`td3_continuous_action.py`](https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/td3_continuous_action.py).
					
						
## Get Started

To use this model, install the `cleanrl` package and then run the evaluation script with the following commands:

```
pip install "cleanrl[td3_continuous_action]"
python -m cleanrl_utils.enjoy --exp-name td3_continuous_action --env-id Walker2d-v4
```

Please refer to the [documentation](https://docs.cleanrl.dev/get-started/zoo/) for more detail.
					
						
## Command to reproduce the training

```bash
curl -OL https://huggingface.co/sdpkjc/Walker2d-v4-td3_continuous_action-seed4/raw/main/td3_continuous_action.py
curl -OL https://huggingface.co/sdpkjc/Walker2d-v4-td3_continuous_action-seed4/raw/main/pyproject.toml
curl -OL https://huggingface.co/sdpkjc/Walker2d-v4-td3_continuous_action-seed4/raw/main/poetry.lock
poetry install --all-extras
python td3_continuous_action.py --save-model --upload-model --hf-entity sdpkjc --env-id Walker2d-v4 --seed 4 --track
```
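
The three download URLs above share one pattern, so they can be derived from the run's hyperparameters. The following is a minimal sketch, assuming the repository name always follows the `<hf_entity>/<env_id>-<exp_name>-seed<seed>` layout seen in those URLs; the `RunId` class and its method names are hypothetical helpers for illustration and are not part of CleanRL.

```python
from dataclasses import dataclass

HUB_BASE = "https://huggingface.co"


@dataclass(frozen=True)
class RunId:
    """Hypothetical helper identifying one training run on the Hub."""

    hf_entity: str
    env_id: str
    exp_name: str
    seed: int

    @property
    def repo_id(self) -> str:
        # Pattern inferred from the URLs on this card (an assumption,
        # not a documented guarantee): <entity>/<env>-<exp>-seed<seed>
        return f"{self.hf_entity}/{self.env_id}-{self.exp_name}-seed{self.seed}"

    def raw_url(self, filename: str, revision: str = "main") -> str:
        # Raw-file URL layout, again copied from the curl commands above.
        return f"{HUB_BASE}/{self.repo_id}/raw/{revision}/{filename}"


run = RunId(
    hf_entity="sdpkjc",
    env_id="Walker2d-v4",
    exp_name="td3_continuous_action",
    seed=4,
)

# Print the three files fetched by the curl commands.
for name in ("td3_continuous_action.py", "pyproject.toml", "poetry.lock"):
    print(run.raw_url(name))
```

Changing `seed` or `env_id` would give the URLs for a sibling run only if that run was uploaded under the same naming scheme.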
					
						
## Hyperparameters

```python
{'batch_size': 256,
 'buffer_size': 1000000,
 'capture_video': False,
 'cuda': True,
 'env_id': 'Walker2d-v4',
 'exp_name': 'td3_continuous_action',
 'exploration_noise': 0.1,
 'gamma': 0.99,
 'hf_entity': 'sdpkjc',
 'learning_rate': 0.0003,
 'learning_starts': 25000.0,
 'noise_clip': 0.5,
 'policy_frequency': 2,
 'policy_noise': 0.2,
 'save_model': True,
 'seed': 4,
 'tau': 0.005,
 'torch_deterministic': True,
 'total_timesteps': 1000000,
 'track': True,
 'upload_model': True,
 'wandb_entity': None,
 'wandb_project_name': 'cleanRL'}
```
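
To make the TD3-specific entries above more concrete, here is a scalar sketch of where `policy_noise`, `noise_clip`, `gamma`, `tau`, and `policy_frequency` enter the algorithm. It is an illustration under stated assumptions, not CleanRL's training code (which operates on PyTorch tensors and networks): actions are assumed to lie in `[-1, 1]`, as they do for Walker2d-v4, and the update schedule in `update_counts` is an assumed loop structure. All function names are hypothetical.

```python
import random

# Values copied from the hyperparameter dump on this card.
HP = {
    "gamma": 0.99,
    "tau": 0.005,
    "policy_noise": 0.2,
    "noise_clip": 0.5,
    "policy_frequency": 2,
    "learning_starts": 25_000,
    "total_timesteps": 1_000_000,
}


def clip(x, low, high):
    return max(low, min(high, x))


def smoothed_target_action(target_mu, rng, low=-1.0, high=1.0):
    """Target policy smoothing: add clipped Gaussian noise, then clip to the action bounds."""
    noise = clip(rng.gauss(0.0, HP["policy_noise"]), -HP["noise_clip"], HP["noise_clip"])
    return clip(target_mu + noise, low, high)


def td_target(reward, done, q1_next, q2_next):
    """Clipped double-Q target: bootstrap from the smaller of the two target critics."""
    return reward + (1.0 - done) * HP["gamma"] * min(q1_next, q2_next)


def soft_update(target_params, online_params):
    """Polyak averaging: move each target weight a fraction tau toward the online weight."""
    tau = HP["tau"]
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online_params, target_params)]


def update_counts(total_timesteps, learning_starts, policy_frequency):
    """Count updates, ASSUMING steps run 0..total-1, the critic updates when
    step > learning_starts, and the actor updates on those steps that are
    also multiples of policy_frequency."""
    last = total_timesteps - 1
    critic = max(0, last - learning_starts)
    actor = max(0, last // policy_frequency - learning_starts // policy_frequency)
    return critic, actor


rng = random.Random(0)
print(smoothed_target_action(0.0, rng))  # a value within [-0.5, 0.5]
print(td_target(reward=1.0, done=0.0, q1_next=10.0, q2_next=12.0))
print(soft_update([0.0], [1.0]))
# Under the assumed schedule this prints (974999, 487499).
print(update_counts(HP["total_timesteps"], HP["learning_starts"], HP["policy_frequency"]))
```

With `policy_frequency` set to 2, the actor and the target networks are updated about half as often as the critics, which is the "delayed" part of TD3.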
					
						