dataset300-parameter-wise-llm-adamerge-shannonentropy-mistral-7b-instrcut-math-code/logs/save_merged_model_20250616_121108.log
2025-06-16 12:11:08 - experiment_save_merged_model - INFO - Starting merged model save process
2025-06-16 12:11:08 - experiment_save_merged_model - INFO - Arguments: {'lambdas_path': '/work/gj26/b20042/LLM-AdaMerge/outputs/mistral-7b/parameter-wise/shannon-entropy-loss/llm_adamerge_parameterwise_lambdas.json', 'model_config': '/work/gj26/b20042/LLM-AdaMerge/src/configs/model_config.yaml', 'output_dir': '/work/gj26/b20042/LLM-AdaMerge/mergekit/outputs/mistral-7b/llmadamerge/parameter-wise-shannonentropy', 'model_name': 'merged-model', 'push_to_hub': False, 'hub_repo_id': 'lejelly/parameter-wise-llm-adamerge-shannonentropy-mistral-7b-instrcut-math-code', 'private': False, 'device': 'cuda', 'debug': False}
2025-06-16 12:11:08 - experiment_save_merged_model - INFO - Loading lambdas from /work/gj26/b20042/LLM-AdaMerge/outputs/mistral-7b/parameter-wise/shannon-entropy-loss/llm_adamerge_parameterwise_lambdas.json
2025-06-16 12:11:08 - experiment_save_merged_model - INFO - Auto-detected parameter-wise merge from JSON structure
2025-06-16 12:11:08 - experiment_save_merged_model - INFO - Merge type: parameter_wise
2025-06-16 12:11:09 - experiment_save_merged_model - INFO - [Initial] Memory Usage:
2025-06-16 12:11:09 - experiment_save_merged_model - INFO - Process: 0.37 GB (0.2%)
2025-06-16 12:11:09 - experiment_save_merged_model - INFO - System: 9.64 GB / 212.52 GB (9.2%)
2025-06-16 12:11:09 - experiment_save_merged_model - INFO - Available: 193.06 GB
2025-06-16 12:11:09 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
2025-06-16 12:11:09 - experiment_save_merged_model - INFO - Loading models
2025-06-16 12:11:27 - experiment_save_merged_model - INFO - [After loading models] Memory Usage:
2025-06-16 12:11:27 - experiment_save_merged_model - INFO - Process: 41.58 GB (19.6%)
2025-06-16 12:11:27 - experiment_save_merged_model - INFO - System: 49.90 GB / 212.52 GB (31.4%)
2025-06-16 12:11:27 - experiment_save_merged_model - INFO - Available: 145.79 GB
2025-06-16 12:11:27 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
2025-06-16 12:11:27 - experiment_save_merged_model - INFO - Initializing parameter_wise AdaMerge
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - Loading learned lambdas
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - Deleting original models to free memory (task vectors already computed)
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - [Before deleting models] Memory Usage:
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - Process: 95.79 GB (45.1%)
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - System: 90.65 GB / 212.52 GB (50.6%)
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - Available: 104.98 GB
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - Clearing model_loader references
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - Deleting model variables
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - Running garbage collection
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - [After deleting models and GC] Memory Usage:
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - Process: 56.44 GB (26.6%)
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - System: 65.00 GB / 212.52 GB (38.5%)
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - Available: 130.63 GB
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - [After loading lambdas] Memory Usage:
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - Process: 56.44 GB (26.6%)
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - System: 65.00 GB / 212.52 GB (38.5%)
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - Available: 130.63 GB
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - Creating merged model with learned lambdas
2025-06-16 12:12:42 - experiment_save_merged_model - INFO - Using merge_models_for_save() for parameter-wise merge
2025-06-16 12:14:36 - experiment_save_merged_model - INFO - [After merging models] Memory Usage:
2025-06-16 12:14:36 - experiment_save_merged_model - INFO - Process: 58.21 GB (27.4%)
2025-06-16 12:14:36 - experiment_save_merged_model - INFO - System: 93.88 GB / 212.52 GB (48.9%)
2025-06-16 12:14:36 - experiment_save_merged_model - INFO - Available: 108.62 GB
2025-06-16 12:14:36 - experiment_save_merged_model - INFO - GPU 0: Allocated: 13.49 GB, Reserved: 27.23 GB, Total: 94.50 GB
2025-06-16 12:14:36 - experiment_save_merged_model - INFO - Freeing memory from AdaMerge object (task vectors and base params no longer needed)
2025-06-16 12:14:36 - experiment_save_merged_model - INFO - Deleting task vectors
2025-06-16 12:14:36 - experiment_save_merged_model - INFO - Deleting base params
2025-06-16 12:14:36 - experiment_save_merged_model - INFO - Deleting functional model
2025-06-16 12:14:36 - experiment_save_merged_model - INFO - [After freeing AdaMerge memory] Memory Usage:
2025-06-16 12:14:36 - experiment_save_merged_model - INFO - Process: 6.21 GB (2.9%)
2025-06-16 12:14:36 - experiment_save_merged_model - INFO - System: 27.91 GB / 212.52 GB (17.8%)
2025-06-16 12:14:36 - experiment_save_merged_model - INFO - Available: 174.60 GB
2025-06-16 12:14:36 - experiment_save_merged_model - INFO - GPU 0: Allocated: 13.49 GB, Reserved: 13.62 GB, Total: 94.50 GB
2025-06-16 12:14:36 - experiment_save_merged_model - INFO - Saving merged model to /work/gj26/b20042/LLM-AdaMerge/mergekit/outputs/mistral-7b/llmadamerge/parameter-wise-shannonentropy
2025-06-16 12:14:36 - experiment_save_merged_model - INFO - Moving parameter-wise merged model to CPU for saving
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - Successfully saved 3 safetensors files:
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - - model-00001-of-00003.safetensors (4714.17 MB)
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - - model-00003-of-00003.safetensors (4330.17 MB)
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - - model-00002-of-00003.safetensors (4768.20 MB)
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - [After saving model] Memory Usage:
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - Process: 15.27 GB (7.2%)
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - System: 23.62 GB / 212.52 GB (19.0%)
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - Available: 172.13 GB
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - Saving tokenizer
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - Copied lambdas file to /work/gj26/b20042/LLM-AdaMerge/mergekit/outputs/mistral-7b/llmadamerge/parameter-wise-shannonentropy/learned_lambdas.json
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - Creating model card
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - Cleaning up models
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - [After cleanup] Memory Usage:
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - Process: 6.43 GB (3.0%)
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - System: 14.80 GB / 212.52 GB (14.9%)
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - Available: 180.95 GB
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
2025-06-16 12:15:09 - experiment_save_merged_model - INFO - Model saved successfully to /work/gj26/b20042/LLM-AdaMerge/mergekit/outputs/mistral-7b/llmadamerge/parameter-wise-shannonentropy