---
tags:
- merge
- parameter_wise
- llm-adamerge
base_model: mistralai/Mistral-7B-v0.1
---

# Merged Model using LLM-AdaMerge (parameter_wise)

This model was created by merging multiple fine-tuned models using the LLM-AdaMerge approach with parameter-wise merging; a minimal sketch of the merge rule is shown after the merge details below.

## Merge Details

- **Merge Type**: parameter_wise
- **Base Model**: mistralai/Mistral-7B-v0.1
- **Number of Models Merged**: 3
- **Models Merged**: instruct, math, code
- **Final Training Loss**: N/A
- **Training Epochs**: 0
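
The merging code itself is not part of this card, so the following is only a rough illustration: parameter-wise merging can be thought of as scaling each parameter tensor of every task vector (fine-tuned weights minus base weights) by its own learned coefficient before adding it to the base model. The function name and the `(task_index, parameter_name)` lambda layout below are hypothetical; the authoritative coefficients are in `learned_lambdas.json`.

```python
# Illustrative sketch only -- not the actual LLM-AdaMerge implementation.
import torch

def merge_parameter_wise(base_state, task_states, lambdas):
    """Merge task vectors into the base model with per-parameter coefficients.

    base_state: state dict of the base model (mistralai/Mistral-7B-v0.1).
    task_states: state dicts of the fine-tuned models (instruct, math, code).
    lambdas: assumed mapping (task_index, parameter_name) -> learned coefficient.
    """
    merged = {}
    for name, base_param in base_state.items():
        update = torch.zeros_like(base_param)
        for i, task_state in enumerate(task_states):
            # Task vector = fine-tuned weights minus base weights, scaled by the
            # coefficient learned for this specific parameter tensor.
            update += lambdas[(i, name)] * (task_state[name] - base_param)
        merged[name] = base_param + update
    return merged
```

Compared with learning a single coefficient per task model, this assigns a coefficient to each individual parameter tensor (291 of them in this merge).
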
## Lambda Coefficients

The lambda coefficients learned during training are summarized below.

### Parameter-wise Lambdas

This model uses parameter-wise lambda coefficients; 291 parameters have individually learned lambdas.

See the uploaded `learned_lambdas.json` file for the detailed parameter-wise coefficients.
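
For a quick look at the learned coefficients, `learned_lambdas.json` can be downloaded from the repository and inspected directly. Its exact schema comes from the merging script and is not documented here; the snippet below assumes a flat mapping from parameter names to coefficient values and should be adapted to the real structure of the file.

```python
# Inspect the learned per-parameter coefficients.
# Assumption: learned_lambdas.json is a flat {parameter_name: value} mapping;
# adapt the loop if the coefficients are nested per task model instead.
import json

from huggingface_hub import hf_hub_download

# "your-username/model-name" is the same placeholder used in the Usage section.
path = hf_hub_download(repo_id="your-username/model-name",
                       filename="learned_lambdas.json")

with open(path) as f:
    lambdas = json.load(f)

print(f"Number of lambda entries: {len(lambdas)}")
for name, value in list(lambdas.items())[:5]:
    print(name, value)
```
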
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("your-username/model-name")
tokenizer = AutoTokenizer.from_pretrained("your-username/model-name")

# Run a quick generation check
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Configuration

See the uploaded `training_config.json` file for the detailed training configuration.

## Citation

If you use this model, please cite the LLM-AdaMerge paper:

```bibtex
@article{llmadamerge2024,
  title={LLM-AdaMerge: Adaptive Model Merging for Large Language Models},
  author={...},
  year={2024}
}
```