---
tags:
- merge
- parameter_wise
- llm-adamerge
base_model: mistralai/Mistral-7B-v0.1
---

# Merged Model using LLM-AdaMerge (parameter_wise)

This model was created by merging multiple fine-tuned models using the LLM-AdaMerge approach with parameter-wise merging; a minimal sketch of the merge rule is shown after the merge details below.

## Merge Details

- **Merge Type**: parameter_wise
- **Base Model**: mistralai/Mistral-7B-v0.1
- **Number of Models Merged**: 3
- **Models Merged**: instruct, math, code
- **Final Training Loss**: N/A
- **Training Epochs**: 0
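
The merging code itself is not part of this card, so the following is only a rough illustration: parameter-wise merging can be thought of as scaling each parameter tensor of every task vector (fine-tuned weights minus base weights) by its own learned coefficient before adding it to the base model. The function name and the `(task_index, parameter_name)` lambda layout below are hypothetical; the authoritative coefficients are in `learned_lambdas.json`.

```python
# Illustrative sketch only -- not the actual LLM-AdaMerge implementation.
import torch

def merge_parameter_wise(base_state, task_states, lambdas):
    """Merge task vectors into the base model with per-parameter coefficients.

    base_state: state dict of the base model (mistralai/Mistral-7B-v0.1).
    task_states: state dicts of the fine-tuned models (instruct, math, code).
    lambdas: assumed mapping (task_index, parameter_name) -> learned coefficient.
    """
    merged = {}
    for name, base_param in base_state.items():
        update = torch.zeros_like(base_param)
        for i, task_state in enumerate(task_states):
            # Task vector = fine-tuned weights minus base weights, scaled by the
            # coefficient learned for this specific parameter tensor.
            update += lambdas[(i, name)] * (task_state[name] - base_param)
        merged[name] = base_param + update
    return merged
```

Compared with learning a single coefficient per task model, this assigns a coefficient to each individual parameter tensor (291 of them in this merge).
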
## Lambda Coefficients

The lambda coefficients learned during training are summarized below.

### Parameter-wise Lambdas

This model uses parameter-wise lambda coefficients; 291 parameters have individually learned lambdas.

See the uploaded `learned_lambdas.json` file for the detailed parameter-wise coefficients.
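
For a quick look at the learned coefficients, `learned_lambdas.json` can be downloaded from the repository and inspected directly. Its exact schema comes from the merging script and is not documented here; the snippet below assumes a flat mapping from parameter names to coefficient values and should be adapted to the real structure of the file.

```python
# Inspect the learned per-parameter coefficients.
# Assumption: learned_lambdas.json is a flat {parameter_name: value} mapping;
# adapt the loop if the coefficients are nested per task model instead.
import json

from huggingface_hub import hf_hub_download

# "your-username/model-name" is the same placeholder used in the Usage section.
path = hf_hub_download(repo_id="your-username/model-name",
                       filename="learned_lambdas.json")

with open(path) as f:
    lambdas = json.load(f)

print(f"Number of lambda entries: {len(lambdas)}")
for name, value in list(lambdas.items())[:5]:
    print(name, value)
```
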
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("your-username/model-name")
tokenizer = AutoTokenizer.from_pretrained("your-username/model-name")

# Run a quick generation check
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Configuration

See the uploaded `training_config.json` file for the detailed training configuration.

## Citation

If you use this model, please cite the LLM-AdaMerge paper:

```bibtex
@article{llmadamerge2024,
  title={LLM-AdaMerge: Adaptive Model Merging for Large Language Models},
  author={...},
  year={2024}
}
```