[2025-03-02 05:27:37,050][00415] Saving configuration to /content/train_dir/default_experiment/config.json...
[2025-03-02 05:27:37,051][00415] Rollout worker 0 uses device cpu
[2025-03-02 05:27:37,053][00415] Rollout worker 1 uses device cpu
[2025-03-02 05:27:37,056][00415] Rollout worker 2 uses device cpu
[2025-03-02 05:27:37,057][00415] Rollout worker 3 uses device cpu
[2025-03-02 05:27:37,058][00415] Rollout worker 4 uses device cpu
[2025-03-02 05:27:37,058][00415] Rollout worker 5 uses device cpu
[2025-03-02 05:27:37,059][00415] Rollout worker 6 uses device cpu
[2025-03-02 05:27:37,060][00415] Rollout worker 7 uses device cpu
[2025-03-02 05:27:37,235][00415] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-03-02 05:27:37,237][00415] InferenceWorker_p0-w0: min num requests: 2
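"min num requests: 2" indicates the inference worker waits for at least two queued rollout requests before running a batched forward pass. A minimal sketch of that serving pattern, assuming a hypothetical `policy.act` and request objects (these names are illustrative, not Sample Factory's internal API):

```python
import queue

def inference_loop(policy, requests: "queue.Queue", min_num_requests: int = 2,
                   timeout: float = 0.02):
    """Serve rollout workers in batches: block for one request, then wait
    briefly for more so each GPU forward pass covers several workers.
    `policy` and the request objects are hypothetical placeholders."""
    while True:
        batch = [requests.get()]                 # block until work arrives
        while len(batch) < min_num_requests:
            try:                                 # top up the batch without
                batch.append(requests.get(timeout=timeout))  # stalling rollouts
            except queue.Empty:
                break
        actions = policy.act([r.obs for r in batch])  # one batched pass
        for req, action in zip(batch, actions):
            req.reply(action)                    # send action back to worker
```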
[2025-03-02 05:27:37,277][00415] Starting all processes...
[2025-03-02 05:27:37,278][00415] Starting process learner_proc0
[2025-03-02 05:27:37,354][00415] Starting all processes...
[2025-03-02 05:27:37,363][00415] Starting process inference_proc0-0
[2025-03-02 05:27:37,363][00415] Starting process rollout_proc0
[2025-03-02 05:27:37,364][00415] Starting process rollout_proc1
[2025-03-02 05:27:37,364][00415] Starting process rollout_proc2
[2025-03-02 05:27:37,364][00415] Starting process rollout_proc3
[2025-03-02 05:27:37,364][00415] Starting process rollout_proc4
[2025-03-02 05:27:37,364][00415] Starting process rollout_proc5
[2025-03-02 05:27:37,917][00415] Starting process rollout_proc6
[2025-03-02 05:27:37,919][00415] Starting process rollout_proc7
[2025-03-02 05:27:54,304][02772] Worker 4 uses CPU cores [0]
[2025-03-02 05:27:54,433][02756] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-03-02 05:27:54,441][02756] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2025-03-02 05:27:54,576][02756] Num visible devices: 1
[2025-03-02 05:27:54,571][02779] Worker 5 uses CPU cores [1]
[2025-03-02 05:27:54,608][02756] Starting seed is not provided
[2025-03-02 05:27:54,609][02756] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-03-02 05:27:54,609][02756] Initializing actor-critic model on device cuda:0
[2025-03-02 05:27:54,609][02756] RunningMeanStd input shape: (3, 72, 128)
[2025-03-02 05:27:54,612][02756] RunningMeanStd input shape: (1,)
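The two RunningMeanStd shapes above are the observation normalizer (RGB frames of shape (3, 72, 128)) and the returns normalizer (scalars of shape (1,)). A sketch of the streaming mean/variance update such normalizers typically use, based on the standard parallel-moments rule (Sample Factory's actual module is an in-place TorchScript variant, so this is illustrative only):

```python
import numpy as np

class RunningMeanStd:
    """Streaming per-element mean/variance via the parallel-update rule."""
    def __init__(self, shape, eps=1e-4):
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.ones(shape, dtype=np.float64)
        self.count = eps  # small prior count avoids division by zero

    def update(self, batch):
        # batch has shape (N, *shape); fold its moments into the running ones
        b_mean, b_var, b_count = batch.mean(0), batch.var(0), batch.shape[0]
        delta = b_mean - self.mean
        tot = self.count + b_count
        self.mean += delta * b_count / tot
        m2 = self.var * self.count + b_var * b_count \
             + delta ** 2 * self.count * b_count / tot
        self.var = m2 / tot
        self.count = tot

    def normalize(self, x):
        return (x - self.mean) / np.sqrt(self.var + 1e-8)
```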
[2025-03-02 05:27:54,740][02756] ConvEncoder: input_channels=3
[2025-03-02 05:27:55,056][02780] Worker 6 uses CPU cores [0]
[2025-03-02 05:27:55,081][02771] Worker 1 uses CPU cores [1]
[2025-03-02 05:27:55,104][02774] Worker 3 uses CPU cores [1]
[2025-03-02 05:27:55,141][02770] Worker 0 uses CPU cores [0]
[2025-03-02 05:27:55,253][02773] Worker 2 uses CPU cores [0]
[2025-03-02 05:27:55,330][02781] Worker 7 uses CPU cores [1]
[2025-03-02 05:27:55,347][02769] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-03-02 05:27:55,348][02769] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2025-03-02 05:27:55,394][02769] Num visible devices: 1
[2025-03-02 05:27:55,431][02756] Conv encoder output size: 512
[2025-03-02 05:27:55,432][02756] Policy head output size: 512
[2025-03-02 05:27:55,507][02756] Created Actor Critic model with architecture:
[2025-03-02 05:27:55,508][02756] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2025-03-02 05:27:55,976][02756] Using optimizer <class 'torch.optim.adam.Adam'>
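The printed module can be reproduced in plain PyTorch. The shapes follow the log (input (3, 72, 128), 512-dim encoder output, GRU(512, 512), 5 discrete actions); the conv kernel sizes and strides are an assumption (Sample Factory's default convnet), since the repr above hides them:

```python
import torch
import torch.nn as nn

class ActorCriticSketch(nn.Module):
    """A sketch of the logged ActorCriticSharedWeights model (normalizers
    omitted). Conv filters are assumed defaults, not read from the log."""
    def __init__(self, num_actions: int = 5, hidden: int = 512):
        super().__init__()
        # Three Conv2d+ELU pairs, mirroring entries (0)-(5) of conv_head
        self.conv_head = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ELU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ELU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ELU(),
        )
        with torch.no_grad():  # infer the flattened size from a dummy frame
            n_flat = self.conv_head(torch.zeros(1, 3, 72, 128)).numel()
        self.mlp_layers = nn.Sequential(nn.Linear(n_flat, hidden), nn.ELU())
        self.core = nn.GRU(hidden, hidden)                 # (core): GRU(512, 512)
        self.critic_linear = nn.Linear(hidden, 1)          # value head
        self.distribution_linear = nn.Linear(hidden, num_actions)  # 5 logits

    def forward(self, obs, rnn_state=None):
        x = self.mlp_layers(self.conv_head(obs).flatten(1))
        x, rnn_state = self.core(x.unsqueeze(0), rnn_state)
        x = x.squeeze(0)
        return self.distribution_linear(x), self.critic_linear(x), rnn_state
```

With these assumed filters, a 72x128 frame shrinks to 128 channels of 3x6, so the flattened conv output is 128*3*6 = 2304, which `mlp_layers` maps to the 512-dim embedding the log reports as "Conv encoder output size: 512".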
[2025-03-02 05:27:57,237][00415] Heartbeat connected on InferenceWorker_p0-w0
[2025-03-02 05:27:57,247][00415] Heartbeat connected on RolloutWorker_w0
[2025-03-02 05:27:57,249][00415] Heartbeat connected on RolloutWorker_w1
[2025-03-02 05:27:57,250][00415] Heartbeat connected on Batcher_0
[2025-03-02 05:27:57,253][00415] Heartbeat connected on RolloutWorker_w2
[2025-03-02 05:27:57,257][00415] Heartbeat connected on RolloutWorker_w3
[2025-03-02 05:27:57,263][00415] Heartbeat connected on RolloutWorker_w4
[2025-03-02 05:27:57,267][00415] Heartbeat connected on RolloutWorker_w5
[2025-03-02 05:27:57,276][00415] Heartbeat connected on RolloutWorker_w7
[2025-03-02 05:27:57,278][00415] Heartbeat connected on RolloutWorker_w6
[2025-03-02 05:28:00,109][02756] No checkpoints found
[2025-03-02 05:28:00,109][02756] Did not load from checkpoint, starting from scratch!
[2025-03-02 05:28:00,109][02756] Initialized policy 0 weights for model version 0
[2025-03-02 05:28:00,112][02756] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-03-02 05:28:00,119][02756] LearnerWorker_p0 finished initialization!
[2025-03-02 05:28:00,126][00415] Heartbeat connected on LearnerWorker_p0
[2025-03-02 05:28:00,264][02769] RunningMeanStd input shape: (3, 72, 128)
[2025-03-02 05:28:00,265][02769] RunningMeanStd input shape: (1,)
[2025-03-02 05:28:00,277][02769] ConvEncoder: input_channels=3
[2025-03-02 05:28:00,377][02769] Conv encoder output size: 512
[2025-03-02 05:28:00,377][02769] Policy head output size: 512
[2025-03-02 05:28:00,412][00415] Inference worker 0-0 is ready!
[2025-03-02 05:28:00,413][00415] All inference workers are ready! Signal rollout workers to start!
[2025-03-02 05:28:00,684][02774] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-02 05:28:00,702][02773] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-02 05:28:00,701][02771] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-02 05:28:00,711][02772] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-02 05:28:00,735][02770] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-02 05:28:00,767][02779] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-02 05:28:00,776][02781] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-02 05:28:00,790][02780] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-02 05:28:01,952][00415] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
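Each status line reports FPS over trailing 10/60/300-second windows plus cumulative frames; the first report is `nan` because there is no earlier sample to difference against. A small sketch of that bookkeeping (illustrative, not Sample Factory's actual reporter):

```python
import time
from collections import deque

class FpsTracker:
    """Windowed FPS: difference cumulative frame counts over trailing windows."""
    def __init__(self, windows=(10, 60, 300)):
        self.windows = windows
        self.samples = deque()  # ordered (timestamp, total_frames) pairs

    def record(self, total_frames):
        now = time.time()
        self.samples.append((now, total_frames))
        # keep only as much history as the largest window needs
        while now - self.samples[0][0] > max(self.windows):
            self.samples.popleft()

    def fps(self):
        now, frames = self.samples[-1]
        report = {}
        for w in self.windows:
            # oldest sample still inside this window
            old = next((s for s in self.samples if now - s[0] <= w), None)
            if old is None or old[0] == now:
                report[w] = float("nan")  # matches the initial nan readings
            else:
                report[w] = (frames - old[1]) / (now - old[0])
        return report
```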
[2025-03-02 05:28:02,452][02780] Decorrelating experience for 0 frames...
[2025-03-02 05:28:02,454][02770] Decorrelating experience for 0 frames...
[2025-03-02 05:28:02,456][02773] Decorrelating experience for 0 frames...
[2025-03-02 05:28:02,454][02779] Decorrelating experience for 0 frames...
[2025-03-02 05:28:02,457][02771] Decorrelating experience for 0 frames...
[2025-03-02 05:28:03,966][02771] Decorrelating experience for 32 frames...
[2025-03-02 05:28:03,961][02774] Decorrelating experience for 0 frames...
[2025-03-02 05:28:03,970][02781] Decorrelating experience for 0 frames...
[2025-03-02 05:28:04,169][02773] Decorrelating experience for 32 frames...
[2025-03-02 05:28:04,172][02770] Decorrelating experience for 32 frames...
[2025-03-02 05:28:04,215][02772] Decorrelating experience for 0 frames...
[2025-03-02 05:28:05,381][02774] Decorrelating experience for 32 frames...
[2025-03-02 05:28:05,548][02781] Decorrelating experience for 32 frames...
[2025-03-02 05:28:05,625][02773] Decorrelating experience for 64 frames...
[2025-03-02 05:28:05,628][02770] Decorrelating experience for 64 frames...
[2025-03-02 05:28:05,834][02771] Decorrelating experience for 64 frames...
[2025-03-02 05:28:06,310][02780] Decorrelating experience for 32 frames...
[2025-03-02 05:28:06,467][02779] Decorrelating experience for 32 frames...
[2025-03-02 05:28:06,823][02773] Decorrelating experience for 96 frames...
[2025-03-02 05:28:06,860][02771] Decorrelating experience for 96 frames...
[2025-03-02 05:28:06,952][00415] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2025-03-02 05:28:07,389][02772] Decorrelating experience for 32 frames...
[2025-03-02 05:28:07,631][02780] Decorrelating experience for 64 frames...
[2025-03-02 05:28:08,333][02781] Decorrelating experience for 64 frames...
[2025-03-02 05:28:08,591][02772] Decorrelating experience for 64 frames...
[2025-03-02 05:28:08,595][02780] Decorrelating experience for 96 frames...
[2025-03-02 05:28:09,193][02779] Decorrelating experience for 64 frames...
[2025-03-02 05:28:10,063][02774] Decorrelating experience for 64 frames...
[2025-03-02 05:28:10,222][02772] Decorrelating experience for 96 frames...
[2025-03-02 05:28:11,870][02781] Decorrelating experience for 96 frames...
[2025-03-02 05:28:11,952][00415] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 59.6. Samples: 596. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2025-03-02 05:28:11,953][00415] Avg episode reward: [(0, '2.657')]
[2025-03-02 05:28:12,644][02779] Decorrelating experience for 96 frames...
[2025-03-02 05:28:12,810][02756] Signal inference workers to stop experience collection...
[2025-03-02 05:28:12,820][02769] InferenceWorker_p0-w0: stopping experience collection
[2025-03-02 05:28:13,022][02774] Decorrelating experience for 96 frames...
[2025-03-02 05:28:13,135][02770] Decorrelating experience for 96 frames...
[2025-03-02 05:28:14,165][02756] Signal inference workers to resume experience collection...
[2025-03-02 05:28:14,166][02769] InferenceWorker_p0-w0: resuming experience collection
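Before regular collection starts, each rollout worker steps its environments through a staggered warmup (reported above in 32-frame increments) so the eight workers do not begin in lock-step, and the learner briefly stops and resumes inference-side collection around its first update. A rough sketch of such a warmup, assuming a Gymnasium-style env and random warmup actions (the exact schedule and action source are internal to Sample Factory):

```python
def decorrelate_experience(env, num_frames: int, report_every: int = 32):
    """Advance the env by `num_frames` steps before real rollouts begin,
    so workers given different `num_frames` end up in different states.
    Random actions are an assumption for this sketch."""
    obs, info = env.reset()
    for frame in range(num_frames):
        if frame % report_every == 0:
            print(f"Decorrelating experience for {frame} frames...")
        obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
        if terminated or truncated:
            obs, info = env.reset()
    return obs
```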
[2025-03-02 05:28:16,952][00415] Fps is (10 sec: 1228.8, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 12288. Throughput: 0: 204.9. Samples: 3074. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2025-03-02 05:28:16,955][00415] Avg episode reward: [(0, '3.165')]
[2025-03-02 05:28:21,952][00415] Fps is (10 sec: 2867.1, 60 sec: 1433.6, 300 sec: 1433.6). Total num frames: 28672. Throughput: 0: 375.2. Samples: 7504. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-02 05:28:21,953][00415] Avg episode reward: [(0, '3.595')]
[2025-03-02 05:28:24,929][02769] Updated weights for policy 0, policy_version 10 (0.0151)
[2025-03-02 05:28:26,952][00415] Fps is (10 sec: 3686.3, 60 sec: 1966.1, 300 sec: 1966.1). Total num frames: 49152. Throughput: 0: 420.2. Samples: 10506. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2025-03-02 05:28:26,953][00415] Avg episode reward: [(0, '4.170')]
[2025-03-02 05:28:31,952][00415] Fps is (10 sec: 3686.5, 60 sec: 2184.5, 300 sec: 2184.5). Total num frames: 65536. Throughput: 0: 569.3. Samples: 17080. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-03-02 05:28:31,953][00415] Avg episode reward: [(0, '4.350')]
[2025-03-02 05:28:35,595][02769] Updated weights for policy 0, policy_version 20 (0.0032)
[2025-03-02 05:28:36,952][00415] Fps is (10 sec: 3686.5, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 86016. Throughput: 0: 633.1. Samples: 22160. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-02 05:28:36,953][00415] Avg episode reward: [(0, '4.425')]
[2025-03-02 05:28:41,952][00415] Fps is (10 sec: 4505.6, 60 sec: 2764.8, 300 sec: 2764.8). Total num frames: 110592. Throughput: 0: 643.2. Samples: 25726. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-02 05:28:41,953][00415] Avg episode reward: [(0, '4.399')]
[2025-03-02 05:28:41,958][02756] Saving new best policy, reward=4.399!
[2025-03-02 05:28:44,146][02769] Updated weights for policy 0, policy_version 30 (0.0025)
[2025-03-02 05:28:46,952][00415] Fps is (10 sec: 4505.6, 60 sec: 2912.7, 300 sec: 2912.7). Total num frames: 131072. Throughput: 0: 714.7. Samples: 32160. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-02 05:28:46,954][00415] Avg episode reward: [(0, '4.402')]
[2025-03-02 05:28:46,959][02756] Saving new best policy, reward=4.402!
[2025-03-02 05:28:51,952][00415] Fps is (10 sec: 3686.3, 60 sec: 2949.1, 300 sec: 2949.1). Total num frames: 147456. Throughput: 0: 832.8. Samples: 37476. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-02 05:28:51,955][00415] Avg episode reward: [(0, '4.449')]
[2025-03-02 05:28:51,962][02756] Saving new best policy, reward=4.449!
[2025-03-02 05:28:55,329][02769] Updated weights for policy 0, policy_version 40 (0.0019)
[2025-03-02 05:28:56,952][00415] Fps is (10 sec: 3686.4, 60 sec: 3053.4, 300 sec: 3053.4). Total num frames: 167936. Throughput: 0: 893.0. Samples: 40782. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-03-02 05:28:56,956][00415] Avg episode reward: [(0, '4.477')]
[2025-03-02 05:28:57,001][02756] Saving new best policy, reward=4.477!
[2025-03-02 05:29:01,952][00415] Fps is (10 sec: 4096.1, 60 sec: 3140.3, 300 sec: 3140.3). Total num frames: 188416. Throughput: 0: 975.1. Samples: 46954. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-02 05:29:01,953][00415] Avg episode reward: [(0, '4.459')]
[2025-03-02 05:29:06,454][02769] Updated weights for policy 0, policy_version 50 (0.0014)
[2025-03-02 05:29:06,952][00415] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3150.8). Total num frames: 204800. Throughput: 0: 988.9. Samples: 52002. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-02 05:29:06,954][00415] Avg episode reward: [(0, '4.364')]
[2025-03-02 05:29:11,952][00415] Fps is (10 sec: 4096.0, 60 sec: 3822.9, 300 sec: 3276.8). Total num frames: 229376. Throughput: 0: 999.5. Samples: 55484. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2025-03-02 05:29:11,955][00415] Avg episode reward: [(0, '4.410')]
[2025-03-02 05:29:15,680][02769] Updated weights for policy 0, policy_version 60 (0.0024)
[2025-03-02 05:29:16,952][00415] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3276.8). Total num frames: 245760. Throughput: 0: 993.2. Samples: 61774. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-02 05:29:16,954][00415] Avg episode reward: [(0, '4.435')]
[2025-03-02 05:29:21,952][00415] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3328.0). Total num frames: 266240. Throughput: 0: 1001.6. Samples: 67234. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-02 05:29:21,959][00415] Avg episode reward: [(0, '4.363')]
[2025-03-02 05:29:26,109][02769] Updated weights for policy 0, policy_version 70 (0.0018)
[2025-03-02 05:29:26,952][00415] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3373.2). Total num frames: 286720. Throughput: 0: 994.7. Samples: 70488. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-02 05:29:26,955][00415] Avg episode reward: [(0, '4.567')]
[2025-03-02 05:29:27,052][02756] Saving new best policy, reward=4.567!
[2025-03-02 05:29:31,952][00415] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3367.8). Total num frames: 303104. Throughput: 0: 981.0. Samples: 76306. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-03-02 05:29:31,955][00415] Avg episode reward: [(0, '4.685')]
[2025-03-02 05:29:32,065][02756] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000075_307200.pth...
[2025-03-02 05:29:32,214][02756] Saving new best policy, reward=4.685!
[2025-03-02 05:29:36,911][02769] Updated weights for policy 0, policy_version 80 (0.0045)
[2025-03-02 05:29:36,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3449.3). Total num frames: 327680. Throughput: 0: 990.5. Samples: 82050. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-02 05:29:36,955][00415] Avg episode reward: [(0, '4.705')]
[2025-03-02 05:29:36,958][02756] Saving new best policy, reward=4.705!
[2025-03-02 05:29:41,952][00415] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3481.6). Total num frames: 348160. Throughput: 0: 994.8. Samples: 85546. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-03-02 05:29:41,955][00415] Avg episode reward: [(0, '4.544')]
[2025-03-02 05:29:46,952][00415] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3471.8). Total num frames: 364544. Throughput: 0: 992.2. Samples: 91604. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2025-03-02 05:29:46,959][00415] Avg episode reward: [(0, '4.468')]
[2025-03-02 05:29:47,180][02769] Updated weights for policy 0, policy_version 90 (0.0034)
[2025-03-02 05:29:51,953][00415] Fps is (10 sec: 4095.3, 60 sec: 4027.6, 300 sec: 3537.4). Total num frames: 389120. Throughput: 0: 1010.5. Samples: 97476. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-02 05:29:51,957][00415] Avg episode reward: [(0, '4.598')]
[2025-03-02 05:29:56,045][02769] Updated weights for policy 0, policy_version 100 (0.0016)
[2025-03-02 05:29:56,959][00415] Fps is (10 sec: 4502.4, 60 sec: 4027.3, 300 sec: 3561.5). Total num frames: 409600. Throughput: 0: 1008.7. Samples: 100884. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-02 05:29:56,961][00415] Avg episode reward: [(0, '4.718')]
[2025-03-02 05:29:57,009][02756] Saving new best policy, reward=4.718!
[2025-03-02 05:30:01,952][00415] Fps is (10 sec: 3687.0, 60 sec: 3959.5, 300 sec: 3549.9). Total num frames: 425984. Throughput: 0: 998.3. Samples: 106696. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-02 05:30:01,955][00415] Avg episode reward: [(0, '4.694')]
[2025-03-02 05:30:06,840][02769] Updated weights for policy 0, policy_version 110 (0.0029)
[2025-03-02 05:30:06,952][00415] Fps is (10 sec: 4098.9, 60 sec: 4096.0, 300 sec: 3604.5). Total num frames: 450560. Throughput: 0: 1013.7. Samples: 112852. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-02 05:30:06,956][00415] Avg episode reward: [(0, '4.664')]
[2025-03-02 05:30:11,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3623.4). Total num frames: 471040. Throughput: 0: 1018.8. Samples: 116336. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-02 05:30:11,955][00415] Avg episode reward: [(0, '4.537')]
[2025-03-02 05:30:16,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3610.5). Total num frames: 487424. Throughput: 0: 1015.8. Samples: 122018. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-02 05:30:16,954][00415] Avg episode reward: [(0, '4.562')]
[2025-03-02 05:30:17,331][02769] Updated weights for policy 0, policy_version 120 (0.0020)
[2025-03-02 05:30:21,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 3657.1). Total num frames: 512000. Throughput: 0: 1030.2. Samples: 128408. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-02 05:30:21,956][00415] Avg episode reward: [(0, '4.513')]
[2025-03-02 05:30:26,034][02769] Updated weights for policy 0, policy_version 130 (0.0027)
[2025-03-02 05:30:26,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 3672.3). Total num frames: 532480. Throughput: 0: 1028.1. Samples: 131810. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-02 05:30:26,956][00415] Avg episode reward: [(0, '4.606')]
[2025-03-02 05:30:31,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 3659.1). Total num frames: 548864. Throughput: 0: 1012.7. Samples: 137176. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-02 05:30:31,953][00415] Avg episode reward: [(0, '4.651')]
[2025-03-02 05:30:36,694][02769] Updated weights for policy 0, policy_version 140 (0.0028)
[2025-03-02 05:30:36,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 3699.6). Total num frames: 573440. Throughput: 0: 1027.1. Samples: 143694. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-02 05:30:36,953][00415] Avg episode reward: [(0, '4.665')]
[2025-03-02 05:30:41,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 3712.0). Total num frames: 593920. Throughput: 0: 1031.2. Samples: 147282. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-02 05:30:41,953][00415] Avg episode reward: [(0, '4.564')]
[2025-03-02 05:30:46,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 3698.8). Total num frames: 610304. Throughput: 0: 1016.7. Samples: 152448. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-02 05:30:46,953][00415] Avg episode reward: [(0, '4.530')]
[2025-03-02 05:30:47,254][02769] Updated weights for policy 0, policy_version 150 (0.0036)
[2025-03-02 05:30:51,952][00415] Fps is (10 sec: 4095.9, 60 sec: 4096.1, 300 sec: 3734.6). Total num frames: 634880. Throughput: 0: 1025.7. Samples: 159010. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-02 05:30:51,953][00415] Avg episode reward: [(0, '4.645')]
[2025-03-02 05:30:56,375][02769] Updated weights for policy 0, policy_version 160 (0.0019)
[2025-03-02 05:30:56,952][00415] Fps is (10 sec: 4505.5, 60 sec: 4096.5, 300 sec: 3744.9). Total num frames: 655360. Throughput: 0: 1022.9. Samples: 162368. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-03-02 05:30:56,958][00415] Avg episode reward: [(0, '4.504')]
[2025-03-02 05:31:01,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 3731.9). Total num frames: 671744. Throughput: 0: 1009.1. Samples: 167428. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-02 05:31:01,953][00415] Avg episode reward: [(0, '4.690')]
[2025-03-02 05:31:06,815][02769] Updated weights for policy 0, policy_version 170 (0.0026)
[2025-03-02 05:31:06,952][00415] Fps is (10 sec: 4096.1, 60 sec: 4096.0, 300 sec: 3763.9). Total num frames: 696320. Throughput: 0: 1016.5. Samples: 174150. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-02 05:31:06,953][00415] Avg episode reward: [(0, '4.739')]
[2025-03-02 05:31:06,954][02756] Saving new best policy, reward=4.739!
[2025-03-02 05:31:11,952][00415] Fps is (10 sec: 4505.7, 60 sec: 4096.0, 300 sec: 3772.6). Total num frames: 716800. Throughput: 0: 1018.9. Samples: 177660. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-02 05:31:11,953][00415] Avg episode reward: [(0, '4.811')]
[2025-03-02 05:31:11,967][02756] Saving new best policy, reward=4.811!
[2025-03-02 05:31:16,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 3759.9). Total num frames: 733184. Throughput: 0: 1010.8. Samples: 182660. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-02 05:31:16,953][00415] Avg episode reward: [(0, '4.990')]
[2025-03-02 05:31:16,957][02756] Saving new best policy, reward=4.990!
[2025-03-02 05:31:17,427][02769] Updated weights for policy 0, policy_version 180 (0.0016)
[2025-03-02 05:31:21,952][00415] Fps is (10 sec: 4095.9, 60 sec: 4096.0, 300 sec: 3788.8). Total num frames: 757760. Throughput: 0: 1018.7. Samples: 189534. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-02 05:31:21,955][00415] Avg episode reward: [(0, '5.488')]
[2025-03-02 05:31:21,962][02756] Saving new best policy, reward=5.488!
[2025-03-02 05:31:26,718][02769] Updated weights for policy 0, policy_version 190 (0.0014)
[2025-03-02 05:31:26,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 3796.3). Total num frames: 778240. Throughput: 0: 1013.1. Samples: 192870. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-02 05:31:26,955][00415] Avg episode reward: [(0, '5.566')]
[2025-03-02 05:31:26,959][02756] Saving new best policy, reward=5.566!
[2025-03-02 05:31:31,952][00415] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 3764.4). Total num frames: 790528. Throughput: 0: 1002.5. Samples: 197562. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-02 05:31:31,956][00415] Avg episode reward: [(0, '5.623')]
[2025-03-02 05:31:32,025][02756] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000194_794624.pth...
[2025-03-02 05:31:32,166][02756] Saving new best policy, reward=5.623!
[2025-03-02 05:31:36,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3791.2). Total num frames: 815104. Throughput: 0: 1008.1. Samples: 204372. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-02 05:31:36,956][00415] Avg episode reward: [(0, '5.363')]
[2025-03-02 05:31:37,310][02769] Updated weights for policy 0, policy_version 200 (0.0016)
[2025-03-02 05:31:41,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3798.1). Total num frames: 835584. Throughput: 0: 1013.3. Samples: 207966. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-03-02 05:31:41,954][00415] Avg episode reward: [(0, '5.526')]
[2025-03-02 05:31:46,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3786.5). Total num frames: 851968. Throughput: 0: 1008.2. Samples: 212798. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-02 05:31:46,953][00415] Avg episode reward: [(0, '5.804')]
[2025-03-02 05:31:47,030][02756] Saving new best policy, reward=5.804!
[2025-03-02 05:31:48,067][02769] Updated weights for policy 0, policy_version 210 (0.0031)
[2025-03-02 05:31:51,952][00415] Fps is (10 sec: 4095.9, 60 sec: 4027.7, 300 sec: 3811.1). Total num frames: 876544. Throughput: 0: 1006.5. Samples: 219442. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-02 05:31:51,953][00415] Avg episode reward: [(0, '6.194')]
[2025-03-02 05:31:51,958][02756] Saving new best policy, reward=6.194!
[2025-03-02 05:31:56,954][00415] Fps is (10 sec: 4094.9, 60 sec: 3959.3, 300 sec: 3799.7). Total num frames: 892928. Throughput: 0: 1001.7. Samples: 222740. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-02 05:31:56,956][00415] Avg episode reward: [(0, '6.200')]
[2025-03-02 05:31:56,961][02756] Saving new best policy, reward=6.200!
[2025-03-02 05:31:59,070][02769] Updated weights for policy 0, policy_version 220 (0.0029)
[2025-03-02 05:32:01,952][00415] Fps is (10 sec: 3276.9, 60 sec: 3959.5, 300 sec: 3788.8). Total num frames: 909312. Throughput: 0: 990.2. Samples: 227218. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-02 05:32:01,958][00415] Avg episode reward: [(0, '6.439')]
[2025-03-02 05:32:01,989][02756] Saving new best policy, reward=6.439!
[2025-03-02 05:32:06,952][00415] Fps is (10 sec: 4097.1, 60 sec: 3959.5, 300 sec: 3811.8). Total num frames: 933888. Throughput: 0: 992.8. Samples: 234208. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-02 05:32:06,953][00415] Avg episode reward: [(0, '6.228')]
[2025-03-02 05:32:08,066][02769] Updated weights for policy 0, policy_version 230 (0.0017)
[2025-03-02 05:32:11,952][00415] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3817.5). Total num frames: 954368. Throughput: 0: 998.1. Samples: 237784. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-03-02 05:32:11,955][00415] Avg episode reward: [(0, '6.153')]
[2025-03-02 05:32:16,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3822.9). Total num frames: 974848. Throughput: 0: 1003.0. Samples: 242696. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-03-02 05:32:16,955][00415] Avg episode reward: [(0, '6.549')]
[2025-03-02 05:32:16,957][02756] Saving new best policy, reward=6.549!
[2025-03-02 05:32:18,640][02769] Updated weights for policy 0, policy_version 240 (0.0030)
[2025-03-02 05:32:21,952][00415] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3828.2). Total num frames: 995328. Throughput: 0: 1008.0. Samples: 249734. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-02 05:32:21,955][00415] Avg episode reward: [(0, '6.233')]
[2025-03-02 05:32:26,952][00415] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3833.2). Total num frames: 1015808. Throughput: 0: 1006.1. Samples: 253240. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-02 05:32:26,957][00415] Avg episode reward: [(0, '5.710')]
[2025-03-02 05:32:29,131][02769] Updated weights for policy 0, policy_version 250 (0.0017)
[2025-03-02 05:32:31,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 3838.1). Total num frames: 1036288. Throughput: 0: 1006.1. Samples: 258074. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-02 05:32:31,957][00415] Avg episode reward: [(0, '6.484')]
[2025-03-02 05:32:36,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 3857.7). Total num frames: 1060864. Throughput: 0: 1016.6. Samples: 265190. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-02 05:32:36,953][00415] Avg episode reward: [(0, '6.491')]
[2025-03-02 05:32:37,769][02769] Updated weights for policy 0, policy_version 260 (0.0033)
[2025-03-02 05:32:41,957][00415] Fps is (10 sec: 4093.8, 60 sec: 4027.4, 300 sec: 3847.2). Total num frames: 1077248. Throughput: 0: 1020.4. Samples: 268662. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-02 05:32:41,959][00415] Avg episode reward: [(0, '6.360')]
[2025-03-02 05:32:46,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 3851.7). Total num frames: 1097728. Throughput: 0: 1035.3. Samples: 273808. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-02 05:32:46,955][00415] Avg episode reward: [(0, '7.062')]
[2025-03-02 05:32:46,957][02756] Saving new best policy, reward=7.062!
[2025-03-02 05:32:48,168][02769] Updated weights for policy 0, policy_version 270 (0.0017)
[2025-03-02 05:32:51,952][00415] Fps is (10 sec: 4508.0, 60 sec: 4096.0, 300 sec: 3870.0). Total num frames: 1122304. Throughput: 0: 1036.6. Samples: 280856. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-02 05:32:51,957][00415] Avg episode reward: [(0, '7.371')]
[2025-03-02 05:32:51,963][02756] Saving new best policy, reward=7.371!
[2025-03-02 05:32:56,953][00415] Fps is (10 sec: 4095.3, 60 sec: 4096.1, 300 sec: 3859.9). Total num frames: 1138688. Throughput: 0: 1027.7. Samples: 284030. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-02 05:32:56,955][00415] Avg episode reward: [(0, '7.331')]
[2025-03-02 05:32:58,886][02769] Updated weights for policy 0, policy_version 280 (0.0028)
[2025-03-02 05:33:01,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4164.3, 300 sec: 3929.4). Total num frames: 1159168. Throughput: 0: 1030.8. Samples: 289084. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-02 05:33:01,953][00415] Avg episode reward: [(0, '7.360')]
[2025-03-02 05:33:06,952][00415] Fps is (10 sec: 4506.4, 60 sec: 4164.3, 300 sec: 4012.7). Total num frames: 1183744. Throughput: 0: 1030.9. Samples: 296124. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-02 05:33:06,953][00415] Avg episode reward: [(0, '8.514')]
[2025-03-02 05:33:06,957][02756] Saving new best policy, reward=8.514!
[2025-03-02 05:33:07,737][02769] Updated weights for policy 0, policy_version 290 (0.0012)
[2025-03-02 05:33:11,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 1200128. Throughput: 0: 1019.4. Samples: 299114. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-02 05:33:11,953][00415] Avg episode reward: [(0, '7.943')]
[2025-03-02 05:33:16,952][00415] Fps is (10 sec: 3686.3, 60 sec: 4096.0, 300 sec: 4040.5). Total num frames: 1220608. Throughput: 0: 1027.5. Samples: 304312. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-02 05:33:16,953][00415] Avg episode reward: [(0, '6.933')]
[2025-03-02 05:33:18,294][02769] Updated weights for policy 0, policy_version 300 (0.0020)
[2025-03-02 05:33:21,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4040.5). Total num frames: 1241088. Throughput: 0: 1023.8. Samples: 311260. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-02 05:33:21,956][00415] Avg episode reward: [(0, '6.539')]
[2025-03-02 05:33:26,952][00415] Fps is (10 sec: 3686.5, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 1257472. Throughput: 0: 1013.7. Samples: 314272. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-03-02 05:33:26,954][00415] Avg episode reward: [(0, '6.292')]
[2025-03-02 05:33:29,319][02769] Updated weights for policy 0, policy_version 310 (0.0023)
[2025-03-02 05:33:31,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4054.3). Total num frames: 1282048. Throughput: 0: 1015.5. Samples: 319506. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2025-03-02 05:33:31,953][00415] Avg episode reward: [(0, '6.332')]
[2025-03-02 05:33:31,964][02756] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000313_1282048.pth...
[2025-03-02 05:33:32,094][02756] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000075_307200.pth
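From here on the learner alternates periodic checkpoints with best-policy snapshots, keeping only the most recent periodic files (checkpoint_000000313 is saved and checkpoint_000000075 removed above, and the same pattern repeats below). A sketch of that keep-last-N rotation; the filename pattern matches the log, while `keep_last=2` is inferred from the save/remove pairs rather than read from the config:

```python
import os
from collections import deque
import torch

class CheckpointRotator:
    """Write checkpoint_<version>_<env_steps>.pth and prune old ones,
    mimicking the Saving/Removing pairs in the log."""
    def __init__(self, save_dir: str, keep_last: int = 2):
        self.save_dir = save_dir
        self.keep_last = keep_last      # assumed, inferred from the log
        self.history = deque()
        os.makedirs(save_dir, exist_ok=True)

    def save(self, state_dict, policy_version: int, env_steps: int):
        name = f"checkpoint_{policy_version:09d}_{env_steps}.pth"
        path = os.path.join(self.save_dir, name)
        torch.save(state_dict, path)    # e.g. checkpoint_000000313_1282048.pth
        self.history.append(path)
        while len(self.history) > self.keep_last:
            os.remove(self.history.popleft())  # drop the oldest checkpoint
```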
|
[2025-03-02 05:33:36,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 1302528. Throughput: 0: 1014.5. Samples: 326510. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:33:36,956][00415] Avg episode reward: [(0, '6.331')] |
|
[2025-03-02 05:33:37,946][02769] Updated weights for policy 0, policy_version 320 (0.0017) |
|
[2025-03-02 05:33:41,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4028.1, 300 sec: 4026.6). Total num frames: 1318912. Throughput: 0: 1008.8. Samples: 329426. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:33:41,957][00415] Avg episode reward: [(0, '6.466')] |
|
[2025-03-02 05:33:46,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4054.3). Total num frames: 1343488. Throughput: 0: 1016.2. Samples: 334812. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:33:46,953][00415] Avg episode reward: [(0, '7.404')] |
|
[2025-03-02 05:33:48,553][02769] Updated weights for policy 0, policy_version 330 (0.0029) |
|
[2025-03-02 05:33:51,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4054.3). Total num frames: 1363968. Throughput: 0: 1017.9. Samples: 341930. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:33:51,955][00415] Avg episode reward: [(0, '8.597')] |
|
[2025-03-02 05:33:51,964][02756] Saving new best policy, reward=8.597! |
|
[2025-03-02 05:33:56,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.8, 300 sec: 4040.5). Total num frames: 1380352. Throughput: 0: 1010.9. Samples: 344606. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:33:56,955][00415] Avg episode reward: [(0, '8.863')] |
|
[2025-03-02 05:33:56,958][02756] Saving new best policy, reward=8.863! |
|
[2025-03-02 05:33:59,545][02769] Updated weights for policy 0, policy_version 340 (0.0028) |
|
[2025-03-02 05:34:01,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4054.3). Total num frames: 1400832. Throughput: 0: 1009.8. Samples: 349752. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:34:01,955][00415] Avg episode reward: [(0, '9.384')] |
|
[2025-03-02 05:34:01,962][02756] Saving new best policy, reward=9.384! |
|
[2025-03-02 05:34:06,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4054.3). Total num frames: 1425408. Throughput: 0: 1004.6. Samples: 356466. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:34:06,956][00415] Avg episode reward: [(0, '9.581')] |
|
[2025-03-02 05:34:06,961][02756] Saving new best policy, reward=9.581! |
|
[2025-03-02 05:34:09,228][02769] Updated weights for policy 0, policy_version 350 (0.0016) |
|
[2025-03-02 05:34:11,952][00415] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 4040.5). Total num frames: 1437696. Throughput: 0: 995.4. Samples: 359064. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:34:11,954][00415] Avg episode reward: [(0, '10.295')] |
|
[2025-03-02 05:34:12,014][02756] Saving new best policy, reward=10.295! |
|
[2025-03-02 05:34:16,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4054.3). Total num frames: 1462272. Throughput: 0: 1004.9. Samples: 364728. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:34:16,953][00415] Avg episode reward: [(0, '10.770')] |
|
[2025-03-02 05:34:16,955][02756] Saving new best policy, reward=10.770! |
|
[2025-03-02 05:34:19,282][02769] Updated weights for policy 0, policy_version 360 (0.0033) |
|
[2025-03-02 05:34:21,952][00415] Fps is (10 sec: 4915.2, 60 sec: 4096.0, 300 sec: 4068.2). Total num frames: 1486848. Throughput: 0: 1003.3. Samples: 371658. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-02 05:34:21,953][00415] Avg episode reward: [(0, '10.631')] |
|
[2025-03-02 05:34:26,955][00415] Fps is (10 sec: 3685.3, 60 sec: 4027.5, 300 sec: 4054.3). Total num frames: 1499136. Throughput: 0: 996.2. Samples: 374256. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-02 05:34:26,957][00415] Avg episode reward: [(0, '10.767')] |
|
[2025-03-02 05:34:29,952][02769] Updated weights for policy 0, policy_version 370 (0.0023) |
|
[2025-03-02 05:34:31,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4054.3). Total num frames: 1523712. Throughput: 0: 1002.4. Samples: 379918. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-02 05:34:31,956][00415] Avg episode reward: [(0, '10.973')] |
|
[2025-03-02 05:34:31,965][02756] Saving new best policy, reward=10.973! |
|
[2025-03-02 05:34:36,952][00415] Fps is (10 sec: 4507.0, 60 sec: 4027.7, 300 sec: 4054.3). Total num frames: 1544192. Throughput: 0: 999.4. Samples: 386904. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:34:36,953][00415] Avg episode reward: [(0, '11.776')] |
|
[2025-03-02 05:34:36,954][02756] Saving new best policy, reward=11.776! |
|
[2025-03-02 05:34:39,764][02769] Updated weights for policy 0, policy_version 380 (0.0012) |
|
[2025-03-02 05:34:41,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4054.3). Total num frames: 1560576. Throughput: 0: 996.4. Samples: 389446. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:34:41,955][00415] Avg episode reward: [(0, '12.286')] |
|
[2025-03-02 05:34:41,965][02756] Saving new best policy, reward=12.286! |
|
[2025-03-02 05:34:46,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4054.4). Total num frames: 1585152. Throughput: 0: 1010.4. Samples: 395222. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:34:46,953][00415] Avg episode reward: [(0, '12.898')] |
|
[2025-03-02 05:34:46,955][02756] Saving new best policy, reward=12.898! |
|
[2025-03-02 05:34:49,637][02769] Updated weights for policy 0, policy_version 390 (0.0018) |
|
[2025-03-02 05:34:51,952][00415] Fps is (10 sec: 4505.4, 60 sec: 4027.7, 300 sec: 4054.4). Total num frames: 1605632. Throughput: 0: 1015.8. Samples: 402176. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-02 05:34:51,955][00415] Avg episode reward: [(0, '13.581')] |
|
[2025-03-02 05:34:51,963][02756] Saving new best policy, reward=13.581! |
|
[2025-03-02 05:34:56,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4054.3). Total num frames: 1622016. Throughput: 0: 1009.1. Samples: 404472. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:34:56,954][00415] Avg episode reward: [(0, '13.831')] |
|
[2025-03-02 05:34:56,957][02756] Saving new best policy, reward=13.831! |
|
[2025-03-02 05:35:00,359][02769] Updated weights for policy 0, policy_version 400 (0.0033) |
|
[2025-03-02 05:35:01,952][00415] Fps is (10 sec: 3686.5, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 1642496. Throughput: 0: 1010.3. Samples: 410192. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:35:01,953][00415] Avg episode reward: [(0, '14.622')] |
|
[2025-03-02 05:35:01,961][02756] Saving new best policy, reward=14.622! |
|
[2025-03-02 05:35:06,952][00415] Fps is (10 sec: 4505.7, 60 sec: 4027.7, 300 sec: 4054.3). Total num frames: 1667072. Throughput: 0: 1011.9. Samples: 417194. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:35:06,956][00415] Avg episode reward: [(0, '14.498')] |
|
[2025-03-02 05:35:10,599][02769] Updated weights for policy 0, policy_version 410 (0.0016) |
|
[2025-03-02 05:35:11,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4054.3). Total num frames: 1683456. Throughput: 0: 1006.2. Samples: 419530. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:35:11,956][00415] Avg episode reward: [(0, '14.706')] |
|
[2025-03-02 05:35:11,962][02756] Saving new best policy, reward=14.706! |
|
[2025-03-02 05:35:16,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 1703936. Throughput: 0: 1013.5. Samples: 425526. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:35:16,953][00415] Avg episode reward: [(0, '14.723')] |
|
[2025-03-02 05:35:16,955][02756] Saving new best policy, reward=14.723! |
|
[2025-03-02 05:35:19,944][02769] Updated weights for policy 0, policy_version 420 (0.0025) |
|
[2025-03-02 05:35:21,954][00415] Fps is (10 sec: 4504.4, 60 sec: 4027.6, 300 sec: 4054.3). Total num frames: 1728512. Throughput: 0: 1011.5. Samples: 432426. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-02 05:35:21,958][00415] Avg episode reward: [(0, '14.902')] |
|
[2025-03-02 05:35:21,970][02756] Saving new best policy, reward=14.902! |
|
[2025-03-02 05:35:26,955][00415] Fps is (10 sec: 3685.4, 60 sec: 4027.7, 300 sec: 4040.4). Total num frames: 1740800. Throughput: 0: 1001.9. Samples: 434534. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0) |
|
[2025-03-02 05:35:26,957][00415] Avg episode reward: [(0, '15.476')] |
|
[2025-03-02 05:35:26,960][02756] Saving new best policy, reward=15.476! |
|
[2025-03-02 05:35:30,897][02769] Updated weights for policy 0, policy_version 430 (0.0018) |
|
[2025-03-02 05:35:31,952][00415] Fps is (10 sec: 3687.4, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 1765376. Throughput: 0: 1003.5. Samples: 440378. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-02 05:35:31,956][00415] Avg episode reward: [(0, '16.339')] |
|
[2025-03-02 05:35:31,963][02756] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000431_1765376.pth... |
|
[2025-03-02 05:35:32,084][02756] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000194_794624.pth |
|
[2025-03-02 05:35:32,097][02756] Saving new best policy, reward=16.339! |
|
[2025-03-02 05:35:36,952][00415] Fps is (10 sec: 4506.8, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 1785856. Throughput: 0: 1000.7. Samples: 447206. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-02 05:35:36,953][00415] Avg episode reward: [(0, '17.591')] |
|
[2025-03-02 05:35:36,956][02756] Saving new best policy, reward=17.591! |
|
[2025-03-02 05:35:41,763][02769] Updated weights for policy 0, policy_version 440 (0.0022) |
|
[2025-03-02 05:35:41,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 1802240. Throughput: 0: 995.0. Samples: 449246. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-02 05:35:41,953][00415] Avg episode reward: [(0, '17.282')] |
|
[2025-03-02 05:35:46,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 1826816. Throughput: 0: 1008.7. Samples: 455584. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-02 05:35:46,953][00415] Avg episode reward: [(0, '16.554')] |
|
[2025-03-02 05:35:50,444][02769] Updated weights for policy 0, policy_version 450 (0.0016) |
|
[2025-03-02 05:35:51,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4027.8, 300 sec: 4040.5). Total num frames: 1847296. Throughput: 0: 1002.2. Samples: 462294. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:35:51,954][00415] Avg episode reward: [(0, '15.857')] |
|
[2025-03-02 05:35:56,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 1863680. Throughput: 0: 997.6. Samples: 464420. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:35:56,953][00415] Avg episode reward: [(0, '15.698')] |
|
[2025-03-02 05:36:01,132][02769] Updated weights for policy 0, policy_version 460 (0.0017) |
|
[2025-03-02 05:36:01,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 1884160. Throughput: 0: 1003.5. Samples: 470684. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-02 05:36:01,953][00415] Avg episode reward: [(0, '15.333')] |
|
[2025-03-02 05:36:06,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 1908736. Throughput: 0: 998.6. Samples: 477362. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:36:06,956][00415] Avg episode reward: [(0, '16.515')] |
|
[2025-03-02 05:36:11,820][02769] Updated weights for policy 0, policy_version 470 (0.0014) |
|
[2025-03-02 05:36:11,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 1925120. Throughput: 0: 997.1. Samples: 479402. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:36:11,956][00415] Avg episode reward: [(0, '16.548')] |
|
[2025-03-02 05:36:16,955][00415] Fps is (10 sec: 3685.0, 60 sec: 4027.5, 300 sec: 4026.5). Total num frames: 1945600. Throughput: 0: 1014.9. Samples: 486052. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:36:16,958][00415] Avg episode reward: [(0, '15.864')] |
|
[2025-03-02 05:36:20,648][02769] Updated weights for policy 0, policy_version 480 (0.0013) |
|
[2025-03-02 05:36:21,957][00415] Fps is (10 sec: 4503.4, 60 sec: 4027.6, 300 sec: 4040.4). Total num frames: 1970176. Throughput: 0: 1007.6. Samples: 492554. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-02 05:36:21,960][00415] Avg episode reward: [(0, '16.457')] |
|
[2025-03-02 05:36:26,952][00415] Fps is (10 sec: 4097.5, 60 sec: 4096.2, 300 sec: 4054.3). Total num frames: 1986560. Throughput: 0: 1011.2. Samples: 494752. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-02 05:36:26,953][00415] Avg episode reward: [(0, '16.406')] |
|
[2025-03-02 05:36:31,133][02769] Updated weights for policy 0, policy_version 490 (0.0035) |
|
[2025-03-02 05:36:31,952][00415] Fps is (10 sec: 3688.2, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 2007040. Throughput: 0: 1018.4. Samples: 501410. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:36:31,953][00415] Avg episode reward: [(0, '16.095')] |
|
[2025-03-02 05:36:36,952][00415] Fps is (10 sec: 4095.7, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 2027520. Throughput: 0: 1007.3. Samples: 507624. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-02 05:36:36,954][00415] Avg episode reward: [(0, '15.588')] |
|
[2025-03-02 05:36:41,874][02769] Updated weights for policy 0, policy_version 500 (0.0022) |
|
[2025-03-02 05:36:41,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4054.3). Total num frames: 2048000. Throughput: 0: 1006.4. Samples: 509710. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:36:41,953][00415] Avg episode reward: [(0, '15.619')] |
|
[2025-03-02 05:36:46,956][00415] Fps is (10 sec: 4094.4, 60 sec: 4027.4, 300 sec: 4040.4). Total num frames: 2068480. Throughput: 0: 1021.0. Samples: 516634. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:36:46,958][00415] Avg episode reward: [(0, '14.932')] |
|
[2025-03-02 05:36:51,007][02769] Updated weights for policy 0, policy_version 510 (0.0019) |
|
[2025-03-02 05:36:51,953][00415] Fps is (10 sec: 4095.3, 60 sec: 4027.6, 300 sec: 4054.4). Total num frames: 2088960. Throughput: 0: 1013.3. Samples: 522964. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-02 05:36:51,959][00415] Avg episode reward: [(0, '15.086')] |
|
[2025-03-02 05:36:56,952][00415] Fps is (10 sec: 4097.9, 60 sec: 4096.0, 300 sec: 4068.2). Total num frames: 2109440. Throughput: 0: 1015.2. Samples: 525088. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:36:56,956][00415] Avg episode reward: [(0, '16.029')] |
|
[2025-03-02 05:37:01,317][02769] Updated weights for policy 0, policy_version 520 (0.0024) |
|
[2025-03-02 05:37:01,952][00415] Fps is (10 sec: 4096.7, 60 sec: 4096.0, 300 sec: 4054.3). Total num frames: 2129920. Throughput: 0: 1021.4. Samples: 532010. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:37:01,956][00415] Avg episode reward: [(0, '17.411')] |
|
[2025-03-02 05:37:06,954][00415] Fps is (10 sec: 3685.6, 60 sec: 3959.3, 300 sec: 4040.4). Total num frames: 2146304. Throughput: 0: 1009.4. Samples: 537974. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:37:06,956][00415] Avg episode reward: [(0, '18.436')] |
|
[2025-03-02 05:37:06,957][02756] Saving new best policy, reward=18.436! |
|
[2025-03-02 05:37:11,906][02769] Updated weights for policy 0, policy_version 530 (0.0023) |
|
[2025-03-02 05:37:11,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4054.3). Total num frames: 2170880. Throughput: 0: 1011.2. Samples: 540254. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:37:11,959][00415] Avg episode reward: [(0, '19.301')] |
|
[2025-03-02 05:37:11,965][02756] Saving new best policy, reward=19.301! |
|
[2025-03-02 05:37:16,952][00415] Fps is (10 sec: 4506.6, 60 sec: 4096.3, 300 sec: 4054.3). Total num frames: 2191360. Throughput: 0: 1017.4. Samples: 547192. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:37:16,957][00415] Avg episode reward: [(0, '19.295')] |
|
[2025-03-02 05:37:21,952][00415] Fps is (10 sec: 3686.4, 60 sec: 3959.8, 300 sec: 4040.5). Total num frames: 2207744. Throughput: 0: 1005.3. Samples: 552864. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-02 05:37:21,953][00415] Avg episode reward: [(0, '19.309')] |
|
[2025-03-02 05:37:21,962][02756] Saving new best policy, reward=19.309! |
|
[2025-03-02 05:37:22,759][02769] Updated weights for policy 0, policy_version 540 (0.0026) |
|
[2025-03-02 05:37:26,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 2228224. Throughput: 0: 1010.7. Samples: 555190. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:37:26,953][00415] Avg episode reward: [(0, '18.189')] |
|
[2025-03-02 05:37:31,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2248704. Throughput: 0: 1002.5. Samples: 561744. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-02 05:37:31,958][00415] Avg episode reward: [(0, '18.482')] |
|
[2025-03-02 05:37:31,970][02756] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000549_2248704.pth... |
|
[2025-03-02 05:37:32,095][02756] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000313_1282048.pth |
|
[2025-03-02 05:37:32,518][02769] Updated weights for policy 0, policy_version 550 (0.0015) |
|
[2025-03-02 05:37:36,952][00415] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 2265088. Throughput: 0: 977.6. Samples: 566956. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:37:36,957][00415] Avg episode reward: [(0, '17.652')] |
|
[2025-03-02 05:37:41,952][00415] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 2285568. Throughput: 0: 986.4. Samples: 569474. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:37:41,956][00415] Avg episode reward: [(0, '17.302')] |
|
[2025-03-02 05:37:43,458][02769] Updated weights for policy 0, policy_version 560 (0.0021) |
|
[2025-03-02 05:37:46,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4028.1, 300 sec: 4026.6). Total num frames: 2310144. Throughput: 0: 984.1. Samples: 576294. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-02 05:37:46,956][00415] Avg episode reward: [(0, '17.119')] |
|
[2025-03-02 05:37:51,953][00415] Fps is (10 sec: 4095.4, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 2326528. Throughput: 0: 975.7. Samples: 581878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:37:51,956][00415] Avg episode reward: [(0, '16.918')] |
|
[2025-03-02 05:37:54,010][02769] Updated weights for policy 0, policy_version 570 (0.0023) |
|
[2025-03-02 05:37:56,952][00415] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 2347008. Throughput: 0: 989.8. Samples: 584794. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-02 05:37:56,954][00415] Avg episode reward: [(0, '15.753')] |
|
[2025-03-02 05:38:01,954][00415] Fps is (10 sec: 4095.5, 60 sec: 3959.3, 300 sec: 4012.7). Total num frames: 2367488. Throughput: 0: 989.4. Samples: 591716. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-02 05:38:01,959][00415] Avg episode reward: [(0, '15.433')] |
|
[2025-03-02 05:38:02,871][02769] Updated weights for policy 0, policy_version 580 (0.0022) |
|
[2025-03-02 05:38:06,952][00415] Fps is (10 sec: 3686.4, 60 sec: 3959.6, 300 sec: 4012.7). Total num frames: 2383872. Throughput: 0: 981.3. Samples: 597022. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:38:06,959][00415] Avg episode reward: [(0, '16.750')] |
|
[2025-03-02 05:38:11,952][00415] Fps is (10 sec: 4097.1, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 2408448. Throughput: 0: 994.6. Samples: 599948. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-02 05:38:11,953][00415] Avg episode reward: [(0, '17.136')] |
|
[2025-03-02 05:38:13,606][02769] Updated weights for policy 0, policy_version 590 (0.0023) |
|
[2025-03-02 05:38:16,955][00415] Fps is (10 sec: 4913.7, 60 sec: 4027.5, 300 sec: 4040.4). Total num frames: 2433024. Throughput: 0: 1005.0. Samples: 606974. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:38:16,956][00415] Avg episode reward: [(0, '18.167')] |
|
[2025-03-02 05:38:21,952][00415] Fps is (10 sec: 3686.3, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 2445312. Throughput: 0: 1008.8. Samples: 612354. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:38:21,953][00415] Avg episode reward: [(0, '19.653')] |
|
[2025-03-02 05:38:21,961][02756] Saving new best policy, reward=19.653! |
|
[2025-03-02 05:38:24,032][02769] Updated weights for policy 0, policy_version 600 (0.0024) |
|
[2025-03-02 05:38:26,952][00415] Fps is (10 sec: 3687.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2469888. Throughput: 0: 1018.3. Samples: 615296. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-02 05:38:26,953][00415] Avg episode reward: [(0, '19.706')] |
|
[2025-03-02 05:38:26,955][02756] Saving new best policy, reward=19.706! |
|
[2025-03-02 05:38:31,952][00415] Fps is (10 sec: 4505.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2490368. Throughput: 0: 1016.6. Samples: 622042. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-02 05:38:31,954][00415] Avg episode reward: [(0, '19.367')] |
|
[2025-03-02 05:38:33,820][02769] Updated weights for policy 0, policy_version 610 (0.0024) |
|
[2025-03-02 05:38:36,952][00415] Fps is (10 sec: 3686.5, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2506752. Throughput: 0: 1001.9. Samples: 626962. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-02 05:38:36,953][00415] Avg episode reward: [(0, '19.988')] |
|
[2025-03-02 05:38:36,957][02756] Saving new best policy, reward=19.988! |
|
[2025-03-02 05:38:41,952][00415] Fps is (10 sec: 3686.6, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2527232. Throughput: 0: 1008.4. Samples: 630174. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:38:41,953][00415] Avg episode reward: [(0, '20.276')] |
|
[2025-03-02 05:38:41,963][02756] Saving new best policy, reward=20.276! |
|
[2025-03-02 05:38:43,961][02769] Updated weights for policy 0, policy_version 620 (0.0020) |
|
[2025-03-02 05:38:46,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2551808. Throughput: 0: 1011.3. Samples: 637220. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:38:46,953][00415] Avg episode reward: [(0, '20.091')] |
|
[2025-03-02 05:38:51,952][00415] Fps is (10 sec: 3686.4, 60 sec: 3959.6, 300 sec: 4012.7). Total num frames: 2564096. Throughput: 0: 1003.5. Samples: 642178. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-02 05:38:51,953][00415] Avg episode reward: [(0, '20.710')] |
|
[2025-03-02 05:38:51,983][02756] Saving new best policy, reward=20.710! |
|
[2025-03-02 05:38:54,591][02769] Updated weights for policy 0, policy_version 630 (0.0029) |
|
[2025-03-02 05:38:56,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2588672. Throughput: 0: 1012.5. Samples: 645510. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:38:56,953][00415] Avg episode reward: [(0, '21.117')] |
|
[2025-03-02 05:38:56,955][02756] Saving new best policy, reward=21.117! |
|
[2025-03-02 05:39:01,955][00415] Fps is (10 sec: 4913.4, 60 sec: 4095.9, 300 sec: 4026.5). Total num frames: 2613248. Throughput: 0: 1011.9. Samples: 652508. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:39:01,957][00415] Avg episode reward: [(0, '20.761')] |
|
[2025-03-02 05:39:04,340][02769] Updated weights for policy 0, policy_version 640 (0.0021) |
|
[2025-03-02 05:39:06,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4040.5). Total num frames: 2629632. Throughput: 0: 1001.1. Samples: 657402. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:39:06,956][00415] Avg episode reward: [(0, '20.025')] |
|
[2025-03-02 05:39:11,952][00415] Fps is (10 sec: 3687.8, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2650112. Throughput: 0: 1015.3. Samples: 660984. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:39:11,953][00415] Avg episode reward: [(0, '21.556')] |
|
[2025-03-02 05:39:11,958][02756] Saving new best policy, reward=21.556! |
|
[2025-03-02 05:39:13,858][02769] Updated weights for policy 0, policy_version 650 (0.0016) |
|
[2025-03-02 05:39:16,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4027.9, 300 sec: 4026.6). Total num frames: 2674688. Throughput: 0: 1021.7. Samples: 668016. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-02 05:39:16,955][00415] Avg episode reward: [(0, '20.958')] |
|
[2025-03-02 05:39:21,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4040.5). Total num frames: 2691072. Throughput: 0: 1022.5. Samples: 672976. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:39:21,953][00415] Avg episode reward: [(0, '21.016')] |
|
[2025-03-02 05:39:24,292][02769] Updated weights for policy 0, policy_version 660 (0.0019) |
|
[2025-03-02 05:39:26,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.8, 300 sec: 4026.6). Total num frames: 2711552. Throughput: 0: 1029.2. Samples: 676488. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-02 05:39:26,953][00415] Avg episode reward: [(0, '21.198')] |
|
[2025-03-02 05:39:31,953][00415] Fps is (10 sec: 4505.0, 60 sec: 4095.9, 300 sec: 4040.4). Total num frames: 2736128. Throughput: 0: 1025.4. Samples: 683364. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-02 05:39:31,956][00415] Avg episode reward: [(0, '21.497')] |
|
[2025-03-02 05:39:31,969][02756] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000668_2736128.pth... |
|
[2025-03-02 05:39:32,190][02756] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000431_1765376.pth |
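The save/remove pair above is periodic checkpoint rotation: a new checkpoint named with the policy version and total env frames is written, then the oldest one beyond the keep limit is deleted (the log pattern is consistent with keeping the two most recent). A sketch of that pattern, with the keep count as an assumption:

```python
import glob
import os
import torch

def save_and_prune(model, ckpt_dir, policy_version, env_frames, keep=2):
    # File name mirrors the log: checkpoint_<policy_version>_<env_frames>.pth
    path = os.path.join(ckpt_dir, f"checkpoint_{policy_version:09d}_{env_frames}.pth")
    print(f"Saving {path}...")
    torch.save(model.state_dict(), path)
    # Zero-padded names sort chronologically; delete all but the newest `keep`.
    for old in sorted(glob.glob(os.path.join(ckpt_dir, "checkpoint_*.pth")))[:-keep]:
        print(f"Removing {old}")
        os.remove(old)
```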
|
[2025-03-02 05:39:34,453][02769] Updated weights for policy 0, policy_version 670 (0.0022) |
|
[2025-03-02 05:39:36,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4040.5). Total num frames: 2752512. Throughput: 0: 1022.4. Samples: 688184. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-02 05:39:36,956][00415] Avg episode reward: [(0, '20.697')] |
|
[2025-03-02 05:39:41,952][00415] Fps is (10 sec: 3686.9, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 2772992. Throughput: 0: 1027.2. Samples: 691734. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-02 05:39:41,955][00415] Avg episode reward: [(0, '21.818')] |
|
[2025-03-02 05:39:41,962][02756] Saving new best policy, reward=21.818! |
|
[2025-03-02 05:39:43,963][02769] Updated weights for policy 0, policy_version 680 (0.0016) |
|
[2025-03-02 05:39:46,957][00415] Fps is (10 sec: 4093.9, 60 sec: 4027.4, 300 sec: 4026.5). Total num frames: 2793472. Throughput: 0: 1022.5. Samples: 698520. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:39:46,967][00415] Avg episode reward: [(0, '21.766')] |
|
[2025-03-02 05:39:51,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 2809856. Throughput: 0: 1017.8. Samples: 703204. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:39:51,956][00415] Avg episode reward: [(0, '20.701')] |
|
[2025-03-02 05:39:54,963][02769] Updated weights for policy 0, policy_version 690 (0.0027) |
|
[2025-03-02 05:39:56,952][00415] Fps is (10 sec: 4098.1, 60 sec: 4096.0, 300 sec: 4040.5). Total num frames: 2834432. Throughput: 0: 1014.7. Samples: 706644. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-02 05:39:56,953][00415] Avg episode reward: [(0, '20.765')] |
|
[2025-03-02 05:40:01,952][00415] Fps is (10 sec: 4505.5, 60 sec: 4028.0, 300 sec: 4026.6). Total num frames: 2854912. Throughput: 0: 1008.5. Samples: 713398. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-02 05:40:01,955][00415] Avg episode reward: [(0, '21.736')] |
|
[2025-03-02 05:40:05,782][02769] Updated weights for policy 0, policy_version 700 (0.0020) |
|
[2025-03-02 05:40:06,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2871296. Throughput: 0: 1001.7. Samples: 718052. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:40:06,957][00415] Avg episode reward: [(0, '20.515')] |
|
[2025-03-02 05:40:11,952][00415] Fps is (10 sec: 4096.1, 60 sec: 4096.0, 300 sec: 4040.5). Total num frames: 2895872. Throughput: 0: 1001.0. Samples: 721532. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-02 05:40:11,953][00415] Avg episode reward: [(0, '20.274')] |
|
[2025-03-02 05:40:14,664][02769] Updated weights for policy 0, policy_version 710 (0.0018) |
|
[2025-03-02 05:40:16,952][00415] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 4012.7). Total num frames: 2912256. Throughput: 0: 1003.4. Samples: 728516. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:40:16,953][00415] Avg episode reward: [(0, '21.016')] |
|
[2025-03-02 05:40:21,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 2932736. Throughput: 0: 1003.8. Samples: 733356. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-02 05:40:21,953][00415] Avg episode reward: [(0, '21.271')] |
|
[2025-03-02 05:40:25,311][02769] Updated weights for policy 0, policy_version 720 (0.0012) |
|
[2025-03-02 05:40:26,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2953216. Throughput: 0: 1002.3. Samples: 736836. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:40:26,953][00415] Avg episode reward: [(0, '21.214')] |
|
[2025-03-02 05:40:31,952][00415] Fps is (10 sec: 4096.0, 60 sec: 3959.6, 300 sec: 4026.6). Total num frames: 2973696. Throughput: 0: 996.9. Samples: 743374. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:40:31,954][00415] Avg episode reward: [(0, '21.146')] |
|
[2025-03-02 05:40:36,107][02769] Updated weights for policy 0, policy_version 730 (0.0018) |
|
[2025-03-02 05:40:36,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 2994176. Throughput: 0: 1005.0. Samples: 748430. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-02 05:40:36,953][00415] Avg episode reward: [(0, '21.670')] |
|
[2025-03-02 05:40:41,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3014656. Throughput: 0: 1005.7. Samples: 751900. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-02 05:40:41,953][00415] Avg episode reward: [(0, '22.506')] |
|
[2025-03-02 05:40:41,957][02756] Saving new best policy, reward=22.506! |
|
[2025-03-02 05:40:45,147][02769] Updated weights for policy 0, policy_version 740 (0.0015) |
|
[2025-03-02 05:40:46,952][00415] Fps is (10 sec: 4095.9, 60 sec: 4028.1, 300 sec: 4026.6). Total num frames: 3035136. Throughput: 0: 1004.0. Samples: 758576. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:40:46,957][00415] Avg episode reward: [(0, '23.364')] |
|
[2025-03-02 05:40:46,958][02756] Saving new best policy, reward=23.364! |
|
[2025-03-02 05:40:51,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3051520. Throughput: 0: 1010.4. Samples: 763522. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:40:51,958][00415] Avg episode reward: [(0, '23.613')] |
|
[2025-03-02 05:40:51,965][02756] Saving new best policy, reward=23.613! |
|
[2025-03-02 05:40:55,986][02769] Updated weights for policy 0, policy_version 750 (0.0018) |
|
[2025-03-02 05:40:56,952][00415] Fps is (10 sec: 4096.1, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 3076096. Throughput: 0: 1008.2. Samples: 766900. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-02 05:40:56,953][00415] Avg episode reward: [(0, '24.571')] |
|
[2025-03-02 05:40:56,958][02756] Saving new best policy, reward=24.571! |
|
[2025-03-02 05:41:01,952][00415] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 4012.7). Total num frames: 3092480. Throughput: 0: 991.4. Samples: 773130. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:41:01,955][00415] Avg episode reward: [(0, '24.168')] |
|
[2025-03-02 05:41:06,925][02769] Updated weights for policy 0, policy_version 760 (0.0036) |
|
[2025-03-02 05:41:06,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3112960. Throughput: 0: 1000.0. Samples: 778354. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:41:06,953][00415] Avg episode reward: [(0, '23.213')] |
|
[2025-03-02 05:41:11,952][00415] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 3133440. Throughput: 0: 1001.4. Samples: 781900. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:41:11,957][00415] Avg episode reward: [(0, '22.209')] |
|
[2025-03-02 05:41:16,024][02769] Updated weights for policy 0, policy_version 770 (0.0013) |
|
[2025-03-02 05:41:16,955][00415] Fps is (10 sec: 4094.5, 60 sec: 4027.5, 300 sec: 4012.7). Total num frames: 3153920. Throughput: 0: 1002.4. Samples: 788486. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-02 05:41:16,957][00415] Avg episode reward: [(0, '20.086')] |
|
[2025-03-02 05:41:21,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3174400. Throughput: 0: 1012.7. Samples: 794000. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:41:21,953][00415] Avg episode reward: [(0, '18.553')] |
|
[2025-03-02 05:41:25,951][02769] Updated weights for policy 0, policy_version 780 (0.0013) |
|
[2025-03-02 05:41:26,952][00415] Fps is (10 sec: 4507.1, 60 sec: 4096.0, 300 sec: 4040.5). Total num frames: 3198976. Throughput: 0: 1013.9. Samples: 797526. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:41:26,955][00415] Avg episode reward: [(0, '20.415')] |
|
[2025-03-02 05:41:31,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3215360. Throughput: 0: 1006.6. Samples: 803874. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:41:31,958][00415] Avg episode reward: [(0, '20.667')] |
|
[2025-03-02 05:41:31,967][02756] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000785_3215360.pth... |
|
[2025-03-02 05:41:32,150][02756] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000549_2248704.pth |
|
[2025-03-02 05:41:36,514][02769] Updated weights for policy 0, policy_version 790 (0.0033) |
|
[2025-03-02 05:41:36,952][00415] Fps is (10 sec: 3686.5, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3235840. Throughput: 0: 1018.3. Samples: 809346. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:41:36,957][00415] Avg episode reward: [(0, '21.080')] |
|
[2025-03-02 05:41:41,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 4040.5). Total num frames: 3260416. Throughput: 0: 1021.4. Samples: 812864. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-02 05:41:41,957][00415] Avg episode reward: [(0, '22.097')] |
|
[2025-03-02 05:41:45,983][02769] Updated weights for policy 0, policy_version 800 (0.0023) |
|
[2025-03-02 05:41:46,952][00415] Fps is (10 sec: 4095.9, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 3276800. Throughput: 0: 1027.0. Samples: 819344. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-02 05:41:46,954][00415] Avg episode reward: [(0, '22.071')] |
|
[2025-03-02 05:41:51,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 3297280. Throughput: 0: 1037.4. Samples: 825038. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-02 05:41:51,953][00415] Avg episode reward: [(0, '20.964')] |
|
[2025-03-02 05:41:55,772][02769] Updated weights for policy 0, policy_version 810 (0.0013) |
|
[2025-03-02 05:41:56,952][00415] Fps is (10 sec: 4505.7, 60 sec: 4096.0, 300 sec: 4040.5). Total num frames: 3321856. Throughput: 0: 1038.0. Samples: 828612. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0) |
|
[2025-03-02 05:41:56,953][00415] Avg episode reward: [(0, '22.119')] |
|
[2025-03-02 05:42:01,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4040.5). Total num frames: 3338240. Throughput: 0: 1029.8. Samples: 834822. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:42:01,953][00415] Avg episode reward: [(0, '22.075')] |
|
[2025-03-02 05:42:06,333][02769] Updated weights for policy 0, policy_version 820 (0.0030) |
|
[2025-03-02 05:42:06,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 3358720. Throughput: 0: 1032.4. Samples: 840460. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-02 05:42:06,956][00415] Avg episode reward: [(0, '21.651')] |
|
[2025-03-02 05:42:11,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4164.3, 300 sec: 4040.5). Total num frames: 3383296. Throughput: 0: 1032.2. Samples: 843976. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-02 05:42:11,953][00415] Avg episode reward: [(0, '20.974')] |
|
[2025-03-02 05:42:16,125][02769] Updated weights for policy 0, policy_version 830 (0.0019) |
|
[2025-03-02 05:42:16,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.3, 300 sec: 4040.5). Total num frames: 3399680. Throughput: 0: 1025.7. Samples: 850030. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:42:16,953][00415] Avg episode reward: [(0, '22.406')] |
|
[2025-03-02 05:42:21,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 4040.5). Total num frames: 3420160. Throughput: 0: 1031.5. Samples: 855764. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:42:21,953][00415] Avg episode reward: [(0, '20.807')] |
|
[2025-03-02 05:42:25,699][02769] Updated weights for policy 0, policy_version 840 (0.0014) |
|
[2025-03-02 05:42:26,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 4054.3). Total num frames: 3444736. Throughput: 0: 1033.2. Samples: 859360. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-02 05:42:26,957][00415] Avg episode reward: [(0, '21.042')] |
|
[2025-03-02 05:42:31,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4054.3). Total num frames: 3461120. Throughput: 0: 1020.9. Samples: 865286. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-02 05:42:31,953][00415] Avg episode reward: [(0, '19.597')] |
|
[2025-03-02 05:42:36,371][02769] Updated weights for policy 0, policy_version 850 (0.0019) |
|
[2025-03-02 05:42:36,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 4054.3). Total num frames: 3481600. Throughput: 0: 1024.1. Samples: 871124. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-02 05:42:36,953][00415] Avg episode reward: [(0, '20.324')] |
|
[2025-03-02 05:42:41,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 4054.3). Total num frames: 3506176. Throughput: 0: 1023.5. Samples: 874668. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-02 05:42:41,953][00415] Avg episode reward: [(0, '20.140')] |
|
[2025-03-02 05:42:46,074][02769] Updated weights for policy 0, policy_version 860 (0.0018) |
|
[2025-03-02 05:42:46,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4054.4). Total num frames: 3522560. Throughput: 0: 1018.1. Samples: 880638. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:42:46,953][00415] Avg episode reward: [(0, '20.428')] |
|
[2025-03-02 05:42:51,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 4054.3). Total num frames: 3543040. Throughput: 0: 1026.5. Samples: 886652. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:42:51,953][00415] Avg episode reward: [(0, '20.608')] |
|
[2025-03-02 05:42:55,450][02769] Updated weights for policy 0, policy_version 870 (0.0024) |
|
[2025-03-02 05:42:56,952][00415] Fps is (10 sec: 4505.5, 60 sec: 4096.0, 300 sec: 4068.3). Total num frames: 3567616. Throughput: 0: 1028.7. Samples: 890268. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:42:56,957][00415] Avg episode reward: [(0, '22.198')] |
|
[2025-03-02 05:43:01,952][00415] Fps is (10 sec: 4095.9, 60 sec: 4096.0, 300 sec: 4068.2). Total num frames: 3584000. Throughput: 0: 1024.0. Samples: 896110. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:43:01,954][00415] Avg episode reward: [(0, '21.874')] |
|
[2025-03-02 05:43:06,066][02769] Updated weights for policy 0, policy_version 880 (0.0021) |
|
[2025-03-02 05:43:06,952][00415] Fps is (10 sec: 3686.5, 60 sec: 4096.0, 300 sec: 4054.3). Total num frames: 3604480. Throughput: 0: 1030.6. Samples: 902140. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:43:06,953][00415] Avg episode reward: [(0, '21.242')] |
|
[2025-03-02 05:43:11,952][00415] Fps is (10 sec: 4505.7, 60 sec: 4096.0, 300 sec: 4054.4). Total num frames: 3629056. Throughput: 0: 1030.0. Samples: 905710. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:43:11,953][00415] Avg episode reward: [(0, '21.663')] |
|
[2025-03-02 05:43:16,220][02769] Updated weights for policy 0, policy_version 890 (0.0015) |
|
[2025-03-02 05:43:16,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4068.2). Total num frames: 3645440. Throughput: 0: 1023.5. Samples: 911344. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-02 05:43:16,953][00415] Avg episode reward: [(0, '22.039')] |
|
[2025-03-02 05:43:21,952][00415] Fps is (10 sec: 4096.0, 60 sec: 4164.3, 300 sec: 4068.2). Total num frames: 3670016. Throughput: 0: 1032.6. Samples: 917592. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:43:21,953][00415] Avg episode reward: [(0, '22.858')] |
|
[2025-03-02 05:43:25,446][02769] Updated weights for policy 0, policy_version 900 (0.0016) |
|
[2025-03-02 05:43:26,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4096.0, 300 sec: 4068.2). Total num frames: 3690496. Throughput: 0: 1030.7. Samples: 921050. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-02 05:43:26,955][00415] Avg episode reward: [(0, '23.354')] |
|
[2025-03-02 05:43:31,952][00415] Fps is (10 sec: 3276.8, 60 sec: 4027.7, 300 sec: 4054.3). Total num frames: 3702784. Throughput: 0: 1015.8. Samples: 926350. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:43:31,956][00415] Avg episode reward: [(0, '23.971')] |
|
[2025-03-02 05:43:31,967][02756] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000905_3706880.pth... |
|
[2025-03-02 05:43:32,100][02756] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000668_2736128.pth |
|
[2025-03-02 05:43:36,528][02769] Updated weights for policy 0, policy_version 910 (0.0014) |
|
[2025-03-02 05:43:36,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 4068.2). Total num frames: 3727360. Throughput: 0: 1015.6. Samples: 932352. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:43:36,954][00415] Avg episode reward: [(0, '23.465')] |
|
[2025-03-02 05:43:41,952][00415] Fps is (10 sec: 4915.2, 60 sec: 4096.0, 300 sec: 4068.2). Total num frames: 3751936. Throughput: 0: 1013.2. Samples: 935860. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:43:41,953][00415] Avg episode reward: [(0, '22.999')] |
|
[2025-03-02 05:43:46,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4068.2). Total num frames: 3764224. Throughput: 0: 1006.7. Samples: 941412. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:43:46,958][00415] Avg episode reward: [(0, '23.151')] |
|
[2025-03-02 05:43:47,023][02769] Updated weights for policy 0, policy_version 920 (0.0017) |
|
[2025-03-02 05:43:51,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 4068.2). Total num frames: 3788800. Throughput: 0: 1015.2. Samples: 947826. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:43:51,954][00415] Avg episode reward: [(0, '23.441')] |
|
[2025-03-02 05:43:55,869][02769] Updated weights for policy 0, policy_version 930 (0.0019) |
|
[2025-03-02 05:43:56,952][00415] Fps is (10 sec: 4915.1, 60 sec: 4096.0, 300 sec: 4068.3). Total num frames: 3813376. Throughput: 0: 1013.2. Samples: 951302. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-02 05:43:56,954][00415] Avg episode reward: [(0, '22.563')] |
|
[2025-03-02 05:44:01,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4054.3). Total num frames: 3825664. Throughput: 0: 1006.9. Samples: 956656. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-02 05:44:01,960][00415] Avg episode reward: [(0, '21.307')] |
|
[2025-03-02 05:44:06,610][02769] Updated weights for policy 0, policy_version 940 (0.0012) |
|
[2025-03-02 05:44:06,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 4068.2). Total num frames: 3850240. Throughput: 0: 1008.1. Samples: 962958. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:44:06,954][00415] Avg episode reward: [(0, '20.813')] |
|
[2025-03-02 05:44:11,952][00415] Fps is (10 sec: 4915.2, 60 sec: 4096.0, 300 sec: 4068.2). Total num frames: 3874816. Throughput: 0: 1008.8. Samples: 966448. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:44:11,954][00415] Avg episode reward: [(0, '20.353')] |
|
[2025-03-02 05:44:16,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4054.3). Total num frames: 3887104. Throughput: 0: 1011.5. Samples: 971866. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:44:16,956][00415] Avg episode reward: [(0, '20.410')] |
|
[2025-03-02 05:44:17,068][02769] Updated weights for policy 0, policy_version 950 (0.0018) |
|
[2025-03-02 05:44:21,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4068.2). Total num frames: 3911680. Throughput: 0: 1024.1. Samples: 978438. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:44:21,953][00415] Avg episode reward: [(0, '21.013')] |
|
[2025-03-02 05:44:25,763][02769] Updated weights for policy 0, policy_version 960 (0.0017) |
|
[2025-03-02 05:44:26,952][00415] Fps is (10 sec: 4915.2, 60 sec: 4096.0, 300 sec: 4068.3). Total num frames: 3936256. Throughput: 0: 1025.6. Samples: 982012. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-02 05:44:26,954][00415] Avg episode reward: [(0, '22.437')] |
|
[2025-03-02 05:44:31,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 4054.3). Total num frames: 3948544. Throughput: 0: 1017.9. Samples: 987216. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:44:31,953][00415] Avg episode reward: [(0, '22.602')] |
|
[2025-03-02 05:44:36,519][02769] Updated weights for policy 0, policy_version 970 (0.0022) |
|
[2025-03-02 05:44:36,952][00415] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 4068.2). Total num frames: 3973120. Throughput: 0: 1020.2. Samples: 993736. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-02 05:44:36,953][00415] Avg episode reward: [(0, '21.853')] |
|
[2025-03-02 05:44:41,952][00415] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 4068.3). Total num frames: 3993600. Throughput: 0: 1020.3. Samples: 997216. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-02 05:44:41,953][00415] Avg episode reward: [(0, '20.738')] |
|
[2025-03-02 05:44:45,070][02756] Stopping Batcher_0... |
|
[2025-03-02 05:44:45,070][02756] Loop batcher_evt_loop terminating... |
|
[2025-03-02 05:44:45,072][02756] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-02 05:44:45,070][00415] Component Batcher_0 stopped! |
|
[2025-03-02 05:44:45,138][02769] Weights refcount: 2 0 |
|
[2025-03-02 05:44:45,141][00415] Component InferenceWorker_p0-w0 stopped! |
|
[2025-03-02 05:44:45,145][02769] Stopping InferenceWorker_p0-w0... |
|
[2025-03-02 05:44:45,147][02769] Loop inference_proc0-0_evt_loop terminating... |
|
[2025-03-02 05:44:45,267][02756] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000785_3215360.pth |
|
[2025-03-02 05:44:45,280][02756] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-02 05:44:45,522][02756] Stopping LearnerWorker_p0... |
|
[2025-03-02 05:44:45,523][00415] Component LearnerWorker_p0 stopped! |
|
[2025-03-02 05:44:45,522][02756] Loop learner_proc0_evt_loop terminating... |
|
[2025-03-02 05:44:45,658][02771] Stopping RolloutWorker_w1... |
|
[2025-03-02 05:44:45,654][00415] Component RolloutWorker_w4 stopped! |
|
[2025-03-02 05:44:45,659][00415] Component RolloutWorker_w1 stopped! |
|
[2025-03-02 05:44:45,660][02772] Stopping RolloutWorker_w4... |
|
[2025-03-02 05:44:45,661][02772] Loop rollout_proc4_evt_loop terminating... |
|
[2025-03-02 05:44:45,659][02771] Loop rollout_proc1_evt_loop terminating... |
|
[2025-03-02 05:44:45,676][00415] Component RolloutWorker_w3 stopped! |
|
[2025-03-02 05:44:45,677][02774] Stopping RolloutWorker_w3... |
|
[2025-03-02 05:44:45,679][02774] Loop rollout_proc3_evt_loop terminating... |
|
[2025-03-02 05:44:45,706][00415] Component RolloutWorker_w0 stopped! |
|
[2025-03-02 05:44:45,707][02770] Stopping RolloutWorker_w0... |
|
[2025-03-02 05:44:45,713][02770] Loop rollout_proc0_evt_loop terminating... |
|
[2025-03-02 05:44:45,732][00415] Component RolloutWorker_w5 stopped! |
|
[2025-03-02 05:44:45,734][02779] Stopping RolloutWorker_w5... |
|
[2025-03-02 05:44:45,736][02779] Loop rollout_proc5_evt_loop terminating... |
|
[2025-03-02 05:44:45,759][00415] Component RolloutWorker_w7 stopped! |
|
[2025-03-02 05:44:45,762][02781] Stopping RolloutWorker_w7... |
|
[2025-03-02 05:44:45,769][02781] Loop rollout_proc7_evt_loop terminating... |
|
[2025-03-02 05:44:45,797][00415] Component RolloutWorker_w2 stopped! |
|
[2025-03-02 05:44:45,797][02773] Stopping RolloutWorker_w2... |
|
[2025-03-02 05:44:45,802][02773] Loop rollout_proc2_evt_loop terminating... |
|
[2025-03-02 05:44:45,816][02780] Stopping RolloutWorker_w6... |
|
[2025-03-02 05:44:45,816][02780] Loop rollout_proc6_evt_loop terminating... |
|
[2025-03-02 05:44:45,816][00415] Component RolloutWorker_w6 stopped! |
|
[2025-03-02 05:44:45,819][00415] Waiting for process learner_proc0 to stop... |
|
[2025-03-02 05:44:47,446][00415] Waiting for process inference_proc0-0 to join... |
|
[2025-03-02 05:44:47,453][00415] Waiting for process rollout_proc0 to join... |
|
[2025-03-02 05:44:49,744][00415] Waiting for process rollout_proc1 to join... |
|
[2025-03-02 05:44:49,787][00415] Waiting for process rollout_proc2 to join... |
|
[2025-03-02 05:44:49,788][00415] Waiting for process rollout_proc3 to join... |
|
[2025-03-02 05:44:49,790][00415] Waiting for process rollout_proc4 to join... |
|
[2025-03-02 05:44:49,791][00415] Waiting for process rollout_proc5 to join... |
|
[2025-03-02 05:44:49,793][00415] Waiting for process rollout_proc6 to join... |
|
[2025-03-02 05:44:49,795][00415] Waiting for process rollout_proc7 to join... |
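The shutdown above proceeds in two phases: each component's event loop is signalled to stop and terminates, then the parent waits for every child process to join. A generic multiprocessing sketch of that pattern (the names and the stop mechanism are illustrative):

```python
def shutdown(processes, stop_event):
    # Phase 1: ask every worker loop to exit ("Stopping ..." / "... terminating").
    stop_event.set()
    # Phase 2: wait for the child processes, learner first in the real run.
    for name, proc in processes.items():
        print(f"Waiting for process {name} to join...")
        proc.join(timeout=60)
```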
|
[2025-03-02 05:44:49,797][00415] Batcher 0 profile tree view: |
|
batching: 25.8281, releasing_batches: 0.0321 |
|
[2025-03-02 05:44:49,797][00415] InferenceWorker_p0-w0 profile tree view: |
|
wait_policy: 0.0000 |
|
wait_policy_total: 395.8200 |
|
update_model: 8.2510 |
|
weight_update: 0.0021 |
|
one_step: 0.0029 |
|
handle_policy_step: 562.9538 |
|
deserialize: 13.6238, stack: 2.9920, obs_to_device_normalize: 119.4444, forward: 287.5309, send_messages: 27.4691 |
|
prepare_outputs: 87.4571 |
|
to_cpu: 54.0201 |
|
[2025-03-02 05:44:49,799][00415] Learner 0 profile tree view: |
|
misc: 0.0039, prepare_batch: 12.6582 |
|
train: 72.2950 |
|
epoch_init: 0.0146, minibatch_init: 0.0053, losses_postprocess: 0.6386, kl_divergence: 0.6297, after_optimizer: 33.7573 |
|
calculate_losses: 25.4526 |
|
losses_init: 0.0039, forward_head: 1.3671, bptt_initial: 16.6656, tail: 1.0513, advantages_returns: 0.2488, losses: 3.9021 |
|
bptt: 1.9405 |
|
bptt_forward_core: 1.8546 |
|
update: 11.3090 |
|
clip: 0.8742 |
|
[2025-03-02 05:44:49,799][00415] RolloutWorker_w0 profile tree view: |
|
wait_for_trajectories: 0.2671, enqueue_policy_requests: 96.1166, env_step: 791.5577, overhead: 11.5509, complete_rollouts: 6.7121 |
|
save_policy_outputs: 18.0269 |
|
split_output_tensors: 6.8828 |
|
[2025-03-02 05:44:49,801][00415] RolloutWorker_w7 profile tree view: |
|
wait_for_trajectories: 0.2460, enqueue_policy_requests: 94.5889, env_step: 793.5135, overhead: 11.5777, complete_rollouts: 7.2414 |
|
save_policy_outputs: 17.3502 |
|
split_output_tensors: 6.4319 |
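The profile tree views above are accumulated wall-clock timings for named, nested scopes inside each component (batcher, inference worker, learner, rollout workers). A tiny illustrative sketch of how such a tree can be gathered with a context manager:

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(float)

@contextmanager
def timed(name):
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] += time.perf_counter() - start

# Nested scopes accumulate independently, like train -> calculate_losses above.
with timed("train"):
    with timed("train/calculate_losses"):
        time.sleep(0.01)  # stand-in for real work

print({k: round(v, 4) for k, v in timings.items()})
```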
|
[2025-03-02 05:44:49,802][00415] Loop Runner_EvtLoop terminating... |
|
[2025-03-02 05:44:49,803][00415] Runner profile tree view: |
|
main_loop: 1032.5267 |
|
[2025-03-02 05:44:49,805][00415] Collected {0: 4005888}, FPS: 3879.7 |
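As a sanity check, the closing totals agree with each other: 4,005,888 collected frames over the 1,032.5267 s main loop works out to 4005888 / 1032.5267 ≈ 3879.7 FPS, exactly the aggregate figure reported here.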
|
[2025-03-02 05:48:54,784][00415] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2025-03-02 05:48:54,785][00415] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2025-03-02 05:48:54,787][00415] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2025-03-02 05:48:54,788][00415] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2025-03-02 05:48:54,790][00415] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-02 05:48:54,790][00415] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2025-03-02 05:48:54,791][00415] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-02 05:48:54,792][00415] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2025-03-02 05:48:54,793][00415] Adding new argument 'push_to_hub'=False that is not in the saved config file! |
|
[2025-03-02 05:48:54,794][00415] Adding new argument 'hf_repository'=None that is not in the saved config file! |
|
[2025-03-02 05:48:54,795][00415] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2025-03-02 05:48:54,796][00415] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2025-03-02 05:48:54,798][00415] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2025-03-02 05:48:54,799][00415] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
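The evaluation script reloads the saved training configuration and layers its own command-line arguments on top, noting every key that the saved config never contained, as in the block above. A minimal sketch of that merge:

```python
import json

def load_config_with_overrides(path, overrides):
    with open(path) as f:
        cfg = json.load(f)
    for key, value in overrides.items():
        if key not in cfg:
            print(f"Adding new argument '{key}'={value} that is not in the saved config file!")
        else:
            print(f"Overriding arg '{key}' with value {value} passed from command line")
        cfg[key] = value
    return cfg
```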
|
[2025-03-02 05:48:54,800][00415] Using frameskip 1 and render_action_repeat=4 for evaluation |
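Frameskip 1 with render_action_repeat=4 means every environment frame is rendered for the video, while each policy action is still held for 4 consecutive frames to match the training-time action repeat. A sketch using the classic 4-tuple gym step API (an assumption about the env interface):

```python
def step_with_repeat(env, action, repeat=4):
    # Apply one policy action for `repeat` rendered frames, summing the reward.
    total_reward = 0.0
    for _ in range(repeat):
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break
    return obs, total_reward, done, info
```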
|
[2025-03-02 05:48:54,833][00415] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-02 05:48:54,836][00415] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-03-02 05:48:54,838][00415] RunningMeanStd input shape: (1,) |
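The two RunningMeanStd lines rebuild the observation and returns normalizers with the training-time shapes: (3, 72, 128) for pixel observations and (1,) for the scalar target. A compact sketch of the streaming mean/variance update such a normalizer performs, using the standard parallel-variance rule rather than the library's exact code:

```python
import numpy as np

class RunningMeanStd:
    def __init__(self, shape):
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.ones(shape, dtype=np.float64)
        self.count = 1e-4  # small prior count avoids division by zero

    def update(self, batch):  # batch: (N, *shape)
        b_mean, b_var, b_count = batch.mean(0), batch.var(0), batch.shape[0]
        delta = b_mean - self.mean
        tot = self.count + b_count
        self.mean = self.mean + delta * b_count / tot
        # Combine the two variance estimates (Chan et al. parallel update).
        m_a = self.var * self.count
        m_b = b_var * b_count
        self.var = (m_a + m_b + delta**2 * self.count * b_count / tot) / tot
        self.count = tot
```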
|
[2025-03-02 05:48:54,852][00415] ConvEncoder: input_channels=3 |
|
[2025-03-02 05:48:54,959][00415] Conv encoder output size: 512 |
|
[2025-03-02 05:48:54,960][00415] Policy head output size: 512 |
|
[2025-03-02 05:48:55,138][00415] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
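Evaluation restores the final checkpoint (policy version 978, 4,005,888 env frames). A hedged sketch of the restore step; the "model" key is an assumption about the checkpoint layout, and the real loader also restores normalizer state:

```python
import torch

def load_policy(actor_critic, ckpt_path, device="cpu"):
    checkpoint = torch.load(ckpt_path, map_location=device)
    actor_critic.load_state_dict(checkpoint["model"])  # key name is an assumption
    actor_critic.eval()  # inference only: no dropout, no gradient updates
    return actor_critic
```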
|
[2025-03-02 05:48:55,883][00415] Num frames 100... |
|
[2025-03-02 05:48:56,009][00415] Num frames 200... |
|
[2025-03-02 05:48:56,139][00415] Num frames 300... |
|
[2025-03-02 05:48:56,268][00415] Num frames 400... |
|
[2025-03-02 05:48:56,401][00415] Num frames 500... |
|
[2025-03-02 05:48:56,531][00415] Num frames 600... |
|
[2025-03-02 05:48:56,665][00415] Num frames 700... |
|
[2025-03-02 05:48:56,800][00415] Num frames 800... |
|
[2025-03-02 05:48:56,934][00415] Num frames 900... |
|
[2025-03-02 05:48:57,064][00415] Num frames 1000... |
|
[2025-03-02 05:48:57,193][00415] Num frames 1100... |
|
[2025-03-02 05:48:57,318][00415] Num frames 1200... |
|
[2025-03-02 05:48:57,444][00415] Num frames 1300... |
|
[2025-03-02 05:48:57,572][00415] Num frames 1400... |
|
[2025-03-02 05:48:57,656][00415] Avg episode rewards: #0: 33.230, true rewards: #0: 14.230 |
|
[2025-03-02 05:48:57,657][00415] Avg episode reward: 33.230, avg true_objective: 14.230 |
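Each "Avg episode rewards" line is a running mean over the episodes completed so far, with the shaped reward and the true objective tracked separately. The drop from 33.230 to 19.195 at the next episode boundary implies the second episode scored 2 × 19.195 − 33.230 = 5.16. A sketch of that bookkeeping:

```python
episode_rewards = []

def on_episode_end(reward):
    episode_rewards.append(reward)
    avg = sum(episode_rewards) / len(episode_rewards)
    print(f"Avg episode reward: {avg:.3f}")

on_episode_end(33.230)  # -> 33.230
on_episode_end(5.160)   # -> 19.195, matching the log
```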
|
[2025-03-02 05:48:57,754][00415] Num frames 1500... |
|
[2025-03-02 05:48:57,892][00415] Num frames 1600... |
|
[2025-03-02 05:48:58,019][00415] Num frames 1700... |
|
[2025-03-02 05:48:58,146][00415] Num frames 1800... |
|
[2025-03-02 05:48:58,251][00415] Avg episode rewards: #0: 19.195, true rewards: #0: 9.195 |
|
[2025-03-02 05:48:58,252][00415] Avg episode reward: 19.195, avg true_objective: 9.195 |
|
[2025-03-02 05:48:58,332][00415] Num frames 1900... |
|
[2025-03-02 05:48:58,460][00415] Num frames 2000... |
|
[2025-03-02 05:48:58,586][00415] Num frames 2100... |
|
[2025-03-02 05:48:58,724][00415] Num frames 2200... |
|
[2025-03-02 05:48:58,849][00415] Avg episode rewards: #0: 14.850, true rewards: #0: 7.517 |
|
[2025-03-02 05:48:58,850][00415] Avg episode reward: 14.850, avg true_objective: 7.517 |
|
[2025-03-02 05:48:58,914][00415] Num frames 2300... |
|
[2025-03-02 05:48:59,042][00415] Num frames 2400... |
|
[2025-03-02 05:48:59,170][00415] Num frames 2500... |
|
[2025-03-02 05:48:59,295][00415] Num frames 2600... |
|
[2025-03-02 05:48:59,421][00415] Num frames 2700... |
|
[2025-03-02 05:48:59,548][00415] Num frames 2800... |
|
[2025-03-02 05:48:59,722][00415] Avg episode rewards: #0: 13.738, true rewards: #0: 7.237 |
|
[2025-03-02 05:48:59,723][00415] Avg episode reward: 13.738, avg true_objective: 7.237 |
|
[2025-03-02 05:48:59,732][00415] Num frames 2900... |
|
[2025-03-02 05:48:59,868][00415] Num frames 3000... |
|
[2025-03-02 05:49:00,000][00415] Num frames 3100... |
|
[2025-03-02 05:49:00,127][00415] Num frames 3200... |
|
[2025-03-02 05:49:00,256][00415] Num frames 3300... |
|
[2025-03-02 05:49:00,384][00415] Num frames 3400... |
|
[2025-03-02 05:49:00,511][00415] Num frames 3500... |
|
[2025-03-02 05:49:00,639][00415] Num frames 3600... |
|
[2025-03-02 05:49:00,766][00415] Num frames 3700... |
|
[2025-03-02 05:49:00,907][00415] Num frames 3800... |
|
[2025-03-02 05:49:01,041][00415] Num frames 3900... |
|
[2025-03-02 05:49:01,173][00415] Num frames 4000... |
|
[2025-03-02 05:49:01,301][00415] Num frames 4100... |
|
[2025-03-02 05:49:01,433][00415] Num frames 4200... |
|
[2025-03-02 05:49:01,563][00415] Num frames 4300... |
|
[2025-03-02 05:49:01,693][00415] Num frames 4400... |
|
[2025-03-02 05:49:01,823][00415] Num frames 4500... |
|
[2025-03-02 05:49:01,963][00415] Num frames 4600... |
|
[2025-03-02 05:49:02,095][00415] Num frames 4700... |
|
[2025-03-02 05:49:02,222][00415] Num frames 4800... |
|
[2025-03-02 05:49:02,367][00415] Num frames 4900... |
|
[2025-03-02 05:49:02,520][00415] Avg episode rewards: #0: 21.950, true rewards: #0: 9.950 |
|
[2025-03-02 05:49:02,521][00415] Avg episode reward: 21.950, avg true_objective: 9.950 |
|
[2025-03-02 05:49:02,554][00415] Num frames 5000... |
|
[2025-03-02 05:49:02,680][00415] Num frames 5100... |
|
[2025-03-02 05:49:02,807][00415] Num frames 5200... |
|
[2025-03-02 05:49:02,954][00415] Num frames 5300... |
|
[2025-03-02 05:49:03,081][00415] Num frames 5400... |
|
[2025-03-02 05:49:03,209][00415] Num frames 5500... |
|
[2025-03-02 05:49:03,368][00415] Num frames 5600... |
|
[2025-03-02 05:49:03,497][00415] Num frames 5700... |
|
[2025-03-02 05:49:03,624][00415] Num frames 5800... |
|
[2025-03-02 05:49:03,748][00415] Num frames 5900... |
|
[2025-03-02 05:49:03,934][00415] Avg episode rewards: #0: 21.665, true rewards: #0: 9.998 |
|
[2025-03-02 05:49:03,935][00415] Avg episode reward: 21.665, avg true_objective: 9.998 |
|
[2025-03-02 05:49:03,939][00415] Num frames 6000... |
|
[2025-03-02 05:49:04,075][00415] Num frames 6100... |
|
[2025-03-02 05:49:04,200][00415] Num frames 6200... |
|
[2025-03-02 05:49:04,326][00415] Num frames 6300... |
|
[2025-03-02 05:49:04,456][00415] Num frames 6400... |
|
[2025-03-02 05:49:04,622][00415] Num frames 6500... |
|
[2025-03-02 05:49:04,798][00415] Num frames 6600... |
|
[2025-03-02 05:49:04,985][00415] Num frames 6700... |
|
[2025-03-02 05:49:05,166][00415] Num frames 6800... |
|
[2025-03-02 05:49:05,334][00415] Num frames 6900... |
|
[2025-03-02 05:49:05,500][00415] Num frames 7000... |
|
[2025-03-02 05:49:05,669][00415] Num frames 7100... |
|
[2025-03-02 05:49:05,846][00415] Num frames 7200... |
|
[2025-03-02 05:49:06,030][00415] Num frames 7300... |
|
[2025-03-02 05:49:06,212][00415] Num frames 7400... |
|
[2025-03-02 05:49:06,393][00415] Num frames 7500... |
|
[2025-03-02 05:49:06,574][00415] Num frames 7600... |
|
[2025-03-02 05:49:06,719][00415] Num frames 7700... |
|
[2025-03-02 05:49:06,880][00415] Avg episode rewards: #0: 24.258, true rewards: #0: 11.116 |
|
[2025-03-02 05:49:06,881][00415] Avg episode reward: 24.258, avg true_objective: 11.116 |
|
[2025-03-02 05:49:06,911][00415] Num frames 7800... |
|
[2025-03-02 05:49:07,039][00415] Num frames 7900... |
|
[2025-03-02 05:49:07,175][00415] Num frames 8000... |
|
[2025-03-02 05:49:07,303][00415] Num frames 8100... |
|
[2025-03-02 05:49:07,431][00415] Num frames 8200... |
|
[2025-03-02 05:49:07,557][00415] Num frames 8300... |
|
[2025-03-02 05:49:07,684][00415] Avg episode rewards: #0: 22.696, true rewards: #0: 10.446 |
|
[2025-03-02 05:49:07,685][00415] Avg episode reward: 22.696, avg true_objective: 10.446 |
|
[2025-03-02 05:49:07,740][00415] Num frames 8400... |
|
[2025-03-02 05:49:07,869][00415] Num frames 8500... |
|
[2025-03-02 05:49:08,003][00415] Num frames 8600... |
|
[2025-03-02 05:49:08,138][00415] Num frames 8700... |
|
[2025-03-02 05:49:08,264][00415] Num frames 8800... |
|
[2025-03-02 05:49:08,393][00415] Num frames 8900... |
|
[2025-03-02 05:49:08,521][00415] Num frames 9000... |
|
[2025-03-02 05:49:08,650][00415] Num frames 9100... |
|
[2025-03-02 05:49:08,781][00415] Num frames 9200... |
|
[2025-03-02 05:49:08,908][00415] Avg episode rewards: #0: 22.059, true rewards: #0: 10.281 |
|
[2025-03-02 05:49:08,909][00415] Avg episode reward: 22.059, avg true_objective: 10.281 |
|
[2025-03-02 05:49:08,971][00415] Num frames 9300... |
|
[2025-03-02 05:49:09,098][00415] Num frames 9400... |
|
[2025-03-02 05:49:09,231][00415] Num frames 9500... |
|
[2025-03-02 05:49:09,358][00415] Num frames 9600... |
|
[2025-03-02 05:49:09,486][00415] Num frames 9700... |
|
[2025-03-02 05:49:09,614][00415] Num frames 9800... |
|
[2025-03-02 05:49:09,741][00415] Num frames 9900... |
|
[2025-03-02 05:49:09,876][00415] Num frames 10000... |
|
[2025-03-02 05:49:10,009][00415] Num frames 10100... |
|
[2025-03-02 05:49:10,140][00415] Num frames 10200... |
|
[2025-03-02 05:49:10,273][00415] Num frames 10300... |
|
[2025-03-02 05:49:10,421][00415] Avg episode rewards: #0: 22.573, true rewards: #0: 10.373 |
|
[2025-03-02 05:49:10,422][00415] Avg episode reward: 22.573, avg true_objective: 10.373 |
|
[2025-03-02 05:50:14,376][00415] Replay video saved to /content/train_dir/default_experiment/replay.mp4! |
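Once the requested episodes have been played, the rendered frames are encoded to replay.mp4. A generic sketch of that step using imageio, which is an assumption; the library may use a different video backend:

```python
import imageio  # the ffmpeg writer also needs the imageio-ffmpeg package

def save_replay(frames, path="/content/train_dir/default_experiment/replay.mp4", fps=35):
    with imageio.get_writer(path, fps=fps) as writer:
        for frame in frames:
            writer.append_data(frame)  # frame: HxWx3 uint8 array
    print(f"Replay video saved to {path}!")
```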
|
[2025-03-02 05:50:52,127][00415] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2025-03-02 05:50:52,135][00415] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2025-03-02 05:50:52,137][00415] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2025-03-02 05:50:52,139][00415] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2025-03-02 05:50:52,140][00415] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-02 05:50:52,142][00415] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2025-03-02 05:50:52,143][00415] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! |
|
[2025-03-02 05:50:52,145][00415] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2025-03-02 05:50:52,146][00415] Adding new argument 'push_to_hub'=True that is not in the saved config file! |
|
[2025-03-02 05:50:52,148][00415] Adding new argument 'hf_repository'='mrinaldi86/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file! |
|
[2025-03-02 05:50:52,149][00415] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2025-03-02 05:50:52,151][00415] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2025-03-02 05:50:52,153][00415] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2025-03-02 05:50:52,155][00415] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2025-03-02 05:50:52,164][00415] Using frameskip 1 and render_action_repeat=4 for evaluation |
|
[2025-03-02 05:50:52,213][00415] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-03-02 05:50:52,215][00415] RunningMeanStd input shape: (1,) |
|
[2025-03-02 05:50:52,235][00415] ConvEncoder: input_channels=3 |
|
[2025-03-02 05:50:52,271][00415] Conv encoder output size: 512 |
|
[2025-03-02 05:50:52,272][00415] Policy head output size: 512 |
|
[2025-03-02 05:50:52,290][00415] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-02 05:50:52,726][00415] Num frames 100... |
|
[2025-03-02 05:50:52,855][00415] Num frames 200... |
|
[2025-03-02 05:50:53,000][00415] Num frames 300... |
|
[2025-03-02 05:50:53,129][00415] Num frames 400... |
|
[2025-03-02 05:50:53,267][00415] Num frames 500... |
|
[2025-03-02 05:50:53,394][00415] Num frames 600... |
|
[2025-03-02 05:50:53,527][00415] Num frames 700... |
|
[2025-03-02 05:50:53,656][00415] Num frames 800... |
|
[2025-03-02 05:50:53,785][00415] Num frames 900... |
|
[2025-03-02 05:50:53,918][00415] Num frames 1000... |
|
[2025-03-02 05:50:54,069][00415] Avg episode rewards: #0: 27.730, true rewards: #0: 10.730 |
|
[2025-03-02 05:50:54,070][00415] Avg episode reward: 27.730, avg true_objective: 10.730 |
|
[2025-03-02 05:50:54,109][00415] Num frames 1100... |
|
[2025-03-02 05:50:54,237][00415] Num frames 1200... |
|
[2025-03-02 05:50:54,370][00415] Num frames 1300... |
|
[2025-03-02 05:50:54,495][00415] Num frames 1400... |
|
[2025-03-02 05:50:54,621][00415] Num frames 1500... |
|
[2025-03-02 05:50:54,746][00415] Num frames 1600... |
|
[2025-03-02 05:50:54,887][00415] Num frames 1700... |
|
[2025-03-02 05:50:55,023][00415] Avg episode rewards: #0: 20.225, true rewards: #0: 8.725 |
|
[2025-03-02 05:50:55,024][00415] Avg episode reward: 20.225, avg true_objective: 8.725 |
|
[2025-03-02 05:50:55,130][00415] Num frames 1800... |
|
[2025-03-02 05:50:55,298][00415] Num frames 1900... |
|
[2025-03-02 05:50:55,476][00415] Num frames 2000... |
|
[2025-03-02 05:50:55,649][00415] Num frames 2100... |
|
[2025-03-02 05:50:55,819][00415] Num frames 2200... |
|
[2025-03-02 05:50:55,995][00415] Num frames 2300... |
|
[2025-03-02 05:50:56,166][00415] Num frames 2400... |
|
[2025-03-02 05:50:56,369][00415] Avg episode rewards: #0: 18.940, true rewards: #0: 8.273 |
|
[2025-03-02 05:50:56,371][00415] Avg episode reward: 18.940, avg true_objective: 8.273 |
|
[2025-03-02 05:50:56,406][00415] Num frames 2500... |
|
[2025-03-02 05:50:56,583][00415] Num frames 2600... |
|
[2025-03-02 05:50:56,763][00415] Num frames 2700... |
|
[2025-03-02 05:50:56,957][00415] Num frames 2800... |
|
[2025-03-02 05:50:57,128][00415] Num frames 2900... |
|
[2025-03-02 05:50:57,260][00415] Num frames 3000... |
|
[2025-03-02 05:50:57,398][00415] Num frames 3100... |
|
[2025-03-02 05:50:57,502][00415] Avg episode rewards: #0: 17.345, true rewards: #0: 7.845 |
|
[2025-03-02 05:50:57,503][00415] Avg episode reward: 17.345, avg true_objective: 7.845 |
|
[2025-03-02 05:50:57,582][00415] Num frames 3200... |
|
[2025-03-02 05:50:57,708][00415] Num frames 3300... |
|
[2025-03-02 05:50:57,842][00415] Num frames 3400... |
|
[2025-03-02 05:50:57,976][00415] Num frames 3500... |
|
[2025-03-02 05:50:58,108][00415] Num frames 3600... |
|
[2025-03-02 05:50:58,237][00415] Num frames 3700... |
|
[2025-03-02 05:50:58,365][00415] Num frames 3800... |
|
[2025-03-02 05:50:58,500][00415] Num frames 3900... |
|
[2025-03-02 05:50:58,627][00415] Num frames 4000... |
|
[2025-03-02 05:50:58,760][00415] Num frames 4100... |
|
[2025-03-02 05:50:58,892][00415] Num frames 4200... |
|
[2025-03-02 05:50:59,021][00415] Num frames 4300... |
|
[2025-03-02 05:50:59,189][00415] Avg episode rewards: #0: 19.972, true rewards: #0: 8.772 |
|
[2025-03-02 05:50:59,190][00415] Avg episode reward: 19.972, avg true_objective: 8.772 |
|
[2025-03-02 05:50:59,211][00415] Num frames 4400... |
|
[2025-03-02 05:50:59,339][00415] Num frames 4500... |
|
[2025-03-02 05:50:59,475][00415] Num frames 4600... |
|
[2025-03-02 05:50:59,604][00415] Num frames 4700... |
|
[2025-03-02 05:50:59,731][00415] Num frames 4800... |
|
[2025-03-02 05:50:59,865][00415] Num frames 4900... |
|
[2025-03-02 05:51:00,001][00415] Num frames 5000... |
|
[2025-03-02 05:51:00,136][00415] Num frames 5100... |
|
[2025-03-02 05:51:00,261][00415] Num frames 5200... |
|
[2025-03-02 05:51:00,388][00415] Num frames 5300... |
|
[2025-03-02 05:51:00,523][00415] Num frames 5400... |
|
[2025-03-02 05:51:00,643][00415] Avg episode rewards: #0: 20.582, true rewards: #0: 9.082 |
|
[2025-03-02 05:51:00,644][00415] Avg episode reward: 20.582, avg true_objective: 9.082 |
|
[2025-03-02 05:51:00,711][00415] Num frames 5500... |
|
[2025-03-02 05:51:00,844][00415] Num frames 5600... |
|
[2025-03-02 05:51:00,977][00415] Num frames 5700... |
|
[2025-03-02 05:51:01,104][00415] Num frames 5800... |
|
[2025-03-02 05:51:01,232][00415] Num frames 5900... |
|
[2025-03-02 05:51:01,351][00415] Avg episode rewards: #0: 18.643, true rewards: #0: 8.500 |
|
[2025-03-02 05:51:01,352][00415] Avg episode reward: 18.643, avg true_objective: 8.500 |
|
[2025-03-02 05:51:01,417][00415] Num frames 6000... |
|
[2025-03-02 05:51:01,553][00415] Num frames 6100... |
|
[2025-03-02 05:51:01,689][00415] Num frames 6200... |
|
[2025-03-02 05:51:01,818][00415] Num frames 6300... |
|
[2025-03-02 05:51:01,952][00415] Num frames 6400... |
|
[2025-03-02 05:51:02,083][00415] Num frames 6500... |
|
[2025-03-02 05:51:02,213][00415] Num frames 6600... |
|
[2025-03-02 05:51:02,344][00415] Num frames 6700... |
|
[2025-03-02 05:51:02,473][00415] Num frames 6800... |
|
[2025-03-02 05:51:02,609][00415] Num frames 6900... |
|
[2025-03-02 05:51:02,737][00415] Num frames 7000... |
|
[2025-03-02 05:51:02,869][00415] Num frames 7100... |
|
[2025-03-02 05:51:03,003][00415] Num frames 7200... |
|
[2025-03-02 05:51:03,134][00415] Num frames 7300... |
|
[2025-03-02 05:51:03,263][00415] Num frames 7400... |
|
[2025-03-02 05:51:03,390][00415] Num frames 7500... |
|
[2025-03-02 05:51:03,529][00415] Num frames 7600... |
|
[2025-03-02 05:51:03,657][00415] Num frames 7700... |
|
[2025-03-02 05:51:03,803][00415] Num frames 7800... |
|
[2025-03-02 05:51:03,950][00415] Num frames 7900... |
|
[2025-03-02 05:51:04,050][00415] Avg episode rewards: #0: 22.543, true rewards: #0: 9.917 |
|
[2025-03-02 05:51:04,051][00415] Avg episode reward: 22.543, avg true_objective: 9.917 |
|
[2025-03-02 05:51:04,139][00415] Num frames 8000... |
|
[2025-03-02 05:51:04,270][00415] Num frames 8100... |
|
[2025-03-02 05:51:04,398][00415] Num frames 8200... |
|
[2025-03-02 05:51:04,532][00415] Num frames 8300... |
|
[2025-03-02 05:51:04,668][00415] Num frames 8400... |
|
[2025-03-02 05:51:04,799][00415] Num frames 8500... |
|
[2025-03-02 05:51:04,932][00415] Num frames 8600... |
|
[2025-03-02 05:51:05,064][00415] Num frames 8700... |
|
[2025-03-02 05:51:05,193][00415] Num frames 8800... |
|
[2025-03-02 05:51:05,320][00415] Num frames 8900... |
|
[2025-03-02 05:51:05,462][00415] Num frames 9000... |
|
[2025-03-02 05:51:05,600][00415] Num frames 9100... |
|
[2025-03-02 05:51:05,729][00415] Num frames 9200... |
|
[2025-03-02 05:51:05,864][00415] Num frames 9300... |
|
[2025-03-02 05:51:06,006][00415] Num frames 9400... |
|
[2025-03-02 05:51:06,137][00415] Num frames 9500... |
|
[2025-03-02 05:51:06,266][00415] Num frames 9600... |
|
[2025-03-02 05:51:06,397][00415] Num frames 9700... |
|
[2025-03-02 05:51:06,529][00415] Num frames 9800... |
|
[2025-03-02 05:51:06,666][00415] Num frames 9900... |
|
[2025-03-02 05:51:06,799][00415] Num frames 10000... |
|
[2025-03-02 05:51:06,899][00415] Avg episode rewards: #0: 26.815, true rewards: #0: 11.149 |
|
[2025-03-02 05:51:06,900][00415] Avg episode reward: 26.815, avg true_objective: 11.149 |
|
[2025-03-02 05:51:06,992][00415] Num frames 10100... |
|
[2025-03-02 05:51:07,139][00415] Num frames 10200... |
|
[2025-03-02 05:51:07,315][00415] Num frames 10300... |
|
[2025-03-02 05:51:07,493][00415] Num frames 10400... |
|
[2025-03-02 05:51:07,681][00415] Num frames 10500... |
|
[2025-03-02 05:51:07,854][00415] Num frames 10600... |
|
[2025-03-02 05:51:08,045][00415] Num frames 10700... |
|
[2025-03-02 05:51:08,216][00415] Num frames 10800... |
|
[2025-03-02 05:51:08,384][00415] Num frames 10900... |
|
[2025-03-02 05:51:08,568][00415] Num frames 11000... |
|
[2025-03-02 05:51:08,753][00415] Num frames 11100... |
|
[2025-03-02 05:51:08,941][00415] Num frames 11200... |
|
[2025-03-02 05:51:09,124][00415] Num frames 11300... |
|
[2025-03-02 05:51:09,305][00415] Num frames 11400... |
|
[2025-03-02 05:51:09,444][00415] Num frames 11500... |
|
[2025-03-02 05:51:09,574][00415] Num frames 11600... |
|
[2025-03-02 05:51:09,713][00415] Num frames 11700... |
|
[2025-03-02 05:51:09,892][00415] Avg episode rewards: #0: 29.395, true rewards: #0: 11.795 |
|
[2025-03-02 05:51:09,893][00415] Avg episode reward: 29.395, avg true_objective: 11.795 |
|
[2025-03-02 05:52:21,556][00415] Replay video saved to /content/train_dir/default_experiment/replay.mp4! |
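This second evaluation ran with push_to_hub=True and hf_repository set, so after the replay is saved the experiment artifacts (checkpoint, config.json, replay.mp4) are uploaded to the named repository. A hedged sketch of such an upload with huggingface_hub, which may differ from the exact call the script performs:

```python
from huggingface_hub import HfApi

api = HfApi()
repo_id = "mrinaldi86/rl_course_vizdoom_health_gathering_supreme"
api.create_repo(repo_id, exist_ok=True)
api.upload_folder(
    repo_id=repo_id,
    folder_path="/content/train_dir/default_experiment",
)
```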
|