t83714
/

llama-3.1-8b-instruct-limo-lora-adapter

@@ -1,26 +1,88 @@
 ---
 library_name: peft
-license: other
 base_model: meta-llama/Llama-3.1-8B-Instruct
 tags:
 - llama-factory
 - lora
 - generated_from_trainer
 model-index:
-- name: lora
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# lora
-This model is a fine-tuned version of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) on the limo dataset.
 ## Model description
-More information needed
 ## Intended uses & limitations
@@ -44,14 +106,27 @@ The following hyperparameters were used during training:
 - lr_scheduler_type: cosine
 - num_epochs: 15
-### Training results
 ### Framework versions
 - PEFT 0.12.0
 - Transformers 4.49.0
 - Pytorch 2.6.0+cu124
 - Datasets 3.3.2
-- Tokenizers 0.21.0

 ---
 library_name: peft
+license: apache-2.0
+pipeline_tag: text-generation
 base_model: meta-llama/Llama-3.1-8B-Instruct
+datasets:
+- GAIR/LIMO
 tags:
 - llama-factory
 - lora
 - generated_from_trainer
+- chat
+- Llama-3
+- instruct
+- finetune
 model-index:
+- name: llama-3.1-8b-instruct-limo-lora
   results: []
 ---
+# llama-3.1-8b-instruct-limo-lora
+This model is a fine-tuned version of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) model. The fine-tuning was performed using Low-Rank Adaptation (LoRA) on the [LIMO dataset](https://huggingface.co/datasets/GAIR/LIMO) to enhance the model's reasoning capabilities, based on the work in the paper: [LIMO: Less is More for Reasoning](https://arxiv.org/pdf/2502.03387).
+This repo contains the LoRA adapter weights only. The merged model can be found from [here](https://huggingface.co/t83714/llama-3.1-8b-instruct-limo).
 ## Model description
+- **Base Model**: [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
+- **Fine-Tuning Dataset**: [GAIR/LIMO](https://huggingface.co/datasets/GAIR/LIMO)
+- **Fine-Tuning Method**: Low-Rank Adaptation (LoRA)
+- **Library Used**: [peft](https://github.com/huggingface/peft)
+- **License**: [Apache 2.0](LICENSE)
+## Usage
+To utilize this model for text generation tasks, follow the steps below:
+### Installation
+Ensure you have the necessary libraries installed:
+```bash
+pip install torch transformers peft
+```
+### Generating Text
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+from peft import PeftModel
+# Load the base model
+base_model_name = "meta-llama/Llama-3.1-8B-Instruct"
+base_model = AutoModelForCausalLM.from_pretrained(base_model_name, torch_dtype="auto", device_map="auto")
+# Load the tokenizer
+tokenizer = AutoTokenizer.from_pretrained(base_model_name)
+# Load the LoRA adapter
+adapter_path = "path_to_your_lora_adapter"
+model = PeftModel.from_pretrained(base_model, adapter_path)
+prompt = "Hello"
+# Tokenize the input
+inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
+# Generate the output
+output = merged_model.generate(**inputs, max_length=200)
+print(tokenizer.decode(output[0], skip_special_tokens=True))
+```
+### Merge the adapter and export merged model
+```python
+from peft import PeftModel
+from transformers import AutoModelForCausalLM
+base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
+model = PeftModel.from_pretrained(base_model, "./")
+merged_model = model.merge_and_unload()
+merged_model.save_pretrained("./merged-model/")
+```
 ## Intended uses & limitations
 - lr_scheduler_type: cosine
 - num_epochs: 15
 ### Framework versions
 - PEFT 0.12.0
 - Transformers 4.49.0
 - Pytorch 2.6.0+cu124
 - Datasets 3.3.2
+- Tokenizers 0.21.0
+## Acknowledgment
+This model is trained based on the work of [Ye et al. (2025)](https://arxiv.org/abs/2502.03387). If you use this model, please also consider citing their paper:
+```bibtex
+@misc{ye2025limoreasoning,
+      title={LIMO: Less is More for Reasoning},
+      author={Yixin Ye and Zhen Huang and Yang Xiao and Ethan Chern and Shijie Xia and Pengfei Liu},
+      year={2025},
+      eprint={2502.03387},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2502.03387},
+}
+```