2025-06-16 12:10:07 - experiment_save_merged_model - INFO - Starting merged model save process
2025-06-16 12:10:07 - experiment_save_merged_model - INFO - Arguments: {'lambdas_path': '/work/gj26/b20042/LLM-AdaMerge/outputs/mistral-7b/parameter-wise/cross-entropy-loss/llm_adamerge_parameterwise_lambdas.json', 'model_config': '/work/gj26/b20042/LLM-AdaMerge/src/configs/model_config.yaml', 'output_dir': '/work/gj26/b20042/LLM-AdaMerge/mergekit/outputs/mistral-7b/llmadamerge/parameter-wise-crossentropy', 'model_name': 'merged-model', 'push_to_hub': False, 'hub_repo_id': 'lejelly/parameter-wise-llm-adamerge-crossentropy-mistral-7b-instrcut-math-code', 'private': False, 'device': 'cuda', 'debug': False}
2025-06-16 12:10:07 - experiment_save_merged_model - INFO - Loading lambdas from /work/gj26/b20042/LLM-AdaMerge/outputs/mistral-7b/parameter-wise/cross-entropy-loss/llm_adamerge_parameterwise_lambdas.json
2025-06-16 12:10:07 - experiment_save_merged_model - INFO - Auto-detected parameter-wise merge from JSON structure
2025-06-16 12:10:07 - experiment_save_merged_model - INFO - Merge type: parameter_wise
2025-06-16 12:10:07 - experiment_save_merged_model - INFO - [Initial] Memory Usage:
2025-06-16 12:10:07 - experiment_save_merged_model - INFO - Process: 0.38 GB (0.2%)
2025-06-16 12:10:07 - experiment_save_merged_model - INFO - System: 9.52 GB / 212.52 GB (9.1%)
2025-06-16 12:10:07 - experiment_save_merged_model - INFO - Available: 193.16 GB
2025-06-16 12:10:07 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
2025-06-16 12:10:07 - experiment_save_merged_model - INFO - Loading models
2025-06-16 12:10:25 - experiment_save_merged_model - INFO - [After loading models] Memory Usage:
2025-06-16 12:10:25 - experiment_save_merged_model - INFO - Process: 40.60 GB (19.1%)
2025-06-16 12:10:25 - experiment_save_merged_model - INFO - System: 48.75 GB / 212.52 GB (30.9%)
2025-06-16 12:10:25 - experiment_save_merged_model - INFO - Available: 146.93 GB
2025-06-16 12:10:25 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
2025-06-16 12:10:25 - experiment_save_merged_model - INFO - Initializing parameter_wise AdaMerge
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - Loading learned lambdas
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - Deleting original models to free memory (task vectors already computed)
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - [Before deleting models] Memory Usage:
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - Process: 94.70 GB (44.6%)
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - System: 89.79 GB / 212.52 GB (50.2%)
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - Available: 105.82 GB
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - Clearing model_loader references
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - Deleting model variables
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - Running garbage collection
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - [After deleting models and GC] Memory Usage:
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - Process: 55.38 GB (26.1%)
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - System: 64.31 GB / 212.52 GB (38.2%)
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - Available: 131.30 GB
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - [After loading lambdas] Memory Usage:
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - Process: 55.38 GB (26.1%)
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - System: 64.31 GB / 212.52 GB (38.2%)
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - Available: 131.30 GB
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - Creating merged model with learned lambdas
2025-06-16 12:11:40 - experiment_save_merged_model - INFO - Using merge_models_for_save() for parameter-wise merge
2025-06-16 12:13:34 - experiment_save_merged_model - INFO - [After merging models] Memory Usage:
2025-06-16 12:13:34 - experiment_save_merged_model - INFO - Process: 57.71 GB (27.2%)
2025-06-16 12:13:34 - experiment_save_merged_model - INFO - System: 93.30 GB / 212.52 GB (48.7%)
2025-06-16 12:13:34 - experiment_save_merged_model - INFO - Available: 109.02 GB
2025-06-16 12:13:34 - experiment_save_merged_model - INFO - GPU 0: Allocated: 13.49 GB, Reserved: 27.23 GB, Total: 94.50 GB
2025-06-16 12:13:34 - experiment_save_merged_model - INFO - Freeing memory from AdaMerge object (task vectors and base params no longer needed)
2025-06-16 12:13:34 - experiment_save_merged_model - INFO - Deleting task vectors
2025-06-16 12:13:34 - experiment_save_merged_model - INFO - Deleting base params
2025-06-16 12:13:34 - experiment_save_merged_model - INFO - Deleting functional model
2025-06-16 12:13:34 - experiment_save_merged_model - INFO - [After freeing AdaMerge memory] Memory Usage:
2025-06-16 12:13:34 - experiment_save_merged_model - INFO - Process: 5.72 GB (2.7%)
2025-06-16 12:13:34 - experiment_save_merged_model - INFO - System: 27.36 GB / 212.52 GB (17.7%)
2025-06-16 12:13:34 - experiment_save_merged_model - INFO - Available: 174.96 GB
2025-06-16 12:13:34 - experiment_save_merged_model - INFO - GPU 0: Allocated: 13.49 GB, Reserved: 13.62 GB, Total: 94.50 GB
2025-06-16 12:13:34 - experiment_save_merged_model - INFO - Saving merged model to /work/gj26/b20042/LLM-AdaMerge/mergekit/outputs/mistral-7b/llmadamerge/parameter-wise-crossentropy
2025-06-16 12:13:34 - experiment_save_merged_model - INFO - Moving parameter-wise merged model to CPU for saving
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - Successfully saved 3 safetensors files:
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - - model-00001-of-00003.safetensors (4714.17 MB)
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - - model-00003-of-00003.safetensors (4330.17 MB)
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - - model-00002-of-00003.safetensors (4768.20 MB)
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - [After saving model] Memory Usage:
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - Process: 15.98 GB (7.5%)
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - System: 24.02 GB / 212.52 GB (19.3%)
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - Available: 171.60 GB
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - Saving tokenizer
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - Copied lambdas file to /work/gj26/b20042/LLM-AdaMerge/mergekit/outputs/mistral-7b/llmadamerge/parameter-wise-crossentropy/learned_lambdas.json
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - Creating model card
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - Cleaning up models
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - [After cleanup] Memory Usage:
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - Process: 4.67 GB (2.2%)
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - System: 12.73 GB / 212.52 GB (13.9%)
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - Available: 182.89 GB
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
2025-06-16 12:14:12 - experiment_save_merged_model - INFO - Model saved successfully to /work/gj26/b20042/LLM-AdaMerge/mergekit/outputs/mistral-7b/llmadamerge/parameter-wise-crossentropy