co-gy/Qwen2.5-0.5B-DPO


co-gy committed on Nov 15, 2024

Commit e7b99b0 · verified · 1 Parent(s): 5923e94

Update README.md

Files changed (1)

README.md +61 -3


@@ -1,3 +1,61 @@
----
-license: apache-2.0
----

+---
+datasets:
+- Intel/orca_dpo_pairs
+base_model:
+- Qwen/Qwen2.5-0.5B-Instruct
+license: apache-2.0
+---
+# Fine-tuned Qwen/Qwen2.5-0.5B-Instruct Model
+## Model Overview
+This model is a fine-tuned version of Qwen/Qwen2.5-0.5B-Instruct. It was trained on the Intel/orca_dpo_pairs preference dataset using DPO (Direct Preference Optimization) with LoRA (Low-Rank Adaptation).
+**Note**: Fine-tuning followed the procedure described in [this blog post](https://towardsdatascience.com/fine-tune-a-mistral-7b-model-with-direct-preference-optimization-708042745aac), which was written for Mistral-7B.
+## Fine-tuning Details
+- **Base Model**: Qwen/Qwen2.5-0.5B-Instruct
+- **Dataset**: Intel/orca_dpo_pairs
+- **Fine-tuning Method**: DPO + LoRA
+## Usage Instructions
+### Install Dependencies
+Before using this model, install the following dependencies (the `transformers` pipeline also needs a backend such as PyTorch):
+```bash
+pip install transformers torch datasets
+```
+### Load the Model
+```python
+import transformers
+from transformers import AutoTokenizer
+
+# Hub repository ID (assumed to match this repository's name)
+model_id = "co-gy/Qwen2.5-0.5B-DPO"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+
+# Format the conversation with the tokenizer's chat template
+message = [
+    {"role": "system", "content": "You are a helpful assistant chatbot."},
+    {"role": "user", "content": "What is a Large Language Model?"}
+]
+prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)
+
+# Build a text-generation pipeline from the same repository
+pipeline = transformers.pipeline(
+    "text-generation",
+    model=model_id,
+    tokenizer=tokenizer
+)
+
+# Sample one completion; max_length counts prompt tokens plus generated tokens
+sequences = pipeline(
+    prompt,
+    do_sample=True,
+    temperature=0.7,
+    top_p=0.9,
+    num_return_sequences=1,
+    max_length=200,
+)
+print(sequences[0]['generated_text'])
+```
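As a rough illustration of the DPO objective named under Fine-tuning Details, the loss for a single preference pair can be sketched in plain Python. This is not the training code used for this model. The function name `dpo_loss`, the default `beta=0.1`, and the example log-probabilities are assumptions made for illustration only.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    Inputs are summed token log-probabilities of each response under the
    trainable policy and under the frozen reference model.
    beta=0.1 is an illustrative default, not a value taken from this model card.
    """
    # Implicit rewards: how far the policy has moved from the reference
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward

    # loss = -log(sigmoid(margin)), written in a numerically stable form
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical log-probabilities: the policy prefers the chosen answer
# more strongly than the reference does, so the loss falls below log(2).
print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
```

When the policy and the reference agree, the margin is zero and the loss equals log(2). Training pushes the margin up, which lowers the loss.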