LoRA Fine-tuning for DBT Model
This repository contains code for fine-tuning a language model using Low-Rank Adaptation (LoRA) on Dialectical Behavior Therapy (DBT) content. The model is based on SmolLM2-360M-Instruct and fine-tuned specifically on Marsha M. Linehan's DBT® Skills Training Manual, Second Edition (The Guilford Press, 2014).
Table of Contents
- Overview
- Requirements
- Project Structure
- Setup
- Training Process
- Running Inference
- Configuration Parameters
- License
Overview
This project fine-tunes a language model specifically for understanding and generating content related to Dialectical Behavior Therapy (DBT). The model is trained on Marsha M. Linehan's authoritative DBT® Skills Training Manual (Second Edition), which is the definitive resource for DBT practitioners. By using LoRA (Low-Rank Adaptation), we can efficiently adapt pre-trained language models to specific domains without the need to retrain the entire model. This approach significantly reduces computational requirements while maintaining performance.
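Concretely, LoRA freezes the base model's weights and learns a low-rank update ΔW = B·A, where B and A share a small rank r, so only those two small matrices are trained. A back-of-the-envelope sketch of the savings for a single square projection (the dimensions and rank below are assumed for illustration):
# Trainable parameters: full weight matrix vs. its rank-r LoRA factors
d, k, r = 960, 960, 16              # projection dims and LoRA rank (assumed values)
full_params = d * k                 # 921,600 if we fine-tuned the matrix directly
lora_params = r * (d + k)           # 30,720 for B (d x r) plus A (r x k)
print(f"LoRA trains {lora_params / full_params:.1%} of this layer")  # ~3.3%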
Requirements
- Python 3.8+
- PyTorch
- Transformers
- PEFT (Parameter-Efficient Fine-Tuning)
- Datasets
- Hugging Face Hub account
- Weights & Biases account (for tracking experiments)
pip install torch transformers peft datasets huggingface_hub wandb langchain
Project Structure
.
├── README.md
├── fine_tune_dbt.py # Main training script
├── linehan_guide.md # Training data (chunked sections of Linehan's DBT® Skills Training Manual)
└── lora-DBT-model/ # Output directory for model
Setup
- Clone this repository:
git clone https://github.com/yourusername/lora-dbt-model.git
cd lora-dbt-model
- Set up your Hugging Face and Weights & Biases tokens:
  - Replace 'hftoken' with your actual Hugging Face token
  - Replace 'wandbkey' with your actual Weights & Biases key
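If you prefer to authenticate in code, a minimal sketch (here 'hftoken' and 'wandbkey' are the same placeholder strings used in the script):
from huggingface_hub import login
import wandb

login(token="hftoken")       # placeholder: your Hugging Face access token
wandb.login(key="wandbkey")  # placeholder: your Weights & Biases API key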
Training Process
The training process includes the following steps:
- Loading the base model (SmolLM2-360M-Instruct)
- Loading and preprocessing the text data
- Tokenizing the data
- Configuring LoRA parameters (see the sketch after this list)
- Training the model
- Saving the LoRA adapters
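A minimal sketch of the LoRA configuration step with PEFT; the rank, alpha, dropout, and target module names below are illustrative assumptions, not values taken from fine_tune_dbt.py:
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-360M-Instruct")
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                                 # LORA_R: rank of the update matrices (assumed)
    lora_alpha=32,                        # LORA_ALPHA: scaling factor (assumed)
    lora_dropout=0.05,                    # LORA_DROPOUT (assumed)
    target_modules=["q_proj", "v_proj"],  # assumed attention projections to adapt
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()        # confirms only a small fraction is trainable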
To run the training:
python fine_tune_dbt.py
The script will automatically push the model to the Hugging Face Hub under the specified repository name.
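The last two steps typically boil down to calls like these (a sketch; OUTPUT_DIR and HF_REPO_NAME are the configuration variables described under Configuration Parameters):
# Save the LoRA adapter weights (not the full base model), then push them to the Hub
model.save_pretrained(OUTPUT_DIR)
tokenizer.save_pretrained(OUTPUT_DIR)
model.push_to_hub(HF_REPO_NAME)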
Running Inference
After training, you can run inference using the Hugging Face pipeline. Here's how to use your fine-tuned model for chat:
from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel, PeftConfig
# Load the fine-tuned model
model_name = "zayzay58/lora-DBT-model" # Replace with your actual model path or HF Hub ID
# Load the adapter config, base model, tokenizer, and LoRA weights from the Hub
config = PeftConfig.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(base_model, model_name)
# Create a chat pipeline
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,     # cap on generated tokens (max_length would count the prompt too)
    do_sample=True,         # required for temperature/top_p to take effect
    temperature=0.7,
    top_p=0.95,
    repetition_penalty=1.15
)
# Format your input for chat - the exact format depends on the base model,
# so let the tokenizer apply the model's own chat template
messages = [
    {"role": "system", "content": "You are a helpful assistant trained in DBT techniques."},
    {"role": "user", "content": "What are some DBT skills for emotion regulation?"}
]
chat_input = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Generate a response; return_full_text=False returns only the newly generated text
response = pipe(chat_input, return_full_text=False)[0]['generated_text']
print(response.strip())
Chat Template
For interactive, multi-turn chat you can wrap generation in a helper like this:
def chat_with_dbt_model(model, tokenizer, conversation_history=None):
    if conversation_history is None:
        conversation_history = [
            {"role": "system", "content": "You are a helpful assistant trained in DBT techniques."}
        ]
    # Apply the base model's chat template and append the assistant header
    # so the model generates a reply
    formatted_prompt = tokenizer.apply_chat_template(
        conversation_history, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
    # Generate response
    output = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
        repetition_penalty=1.15
    )
    # Decode only the newly generated tokens, i.e. the assistant's reply
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    assistant_response = tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
    return assistant_response
# Example usage
conversation = [
{"role": "system", "content": "You are a helpful assistant trained in DBT techniques."},
{"role": "user", "content": "What are some DBT skills for emotion regulation?"}
]
response = chat_with_dbt_model(model, tokenizer, conversation)
print(response)
# Continue the conversation
conversation.append({"role": "assistant", "content": response})
conversation.append({"role": "user", "content": "Can you explain the TIPP skill in more detail?"})
response = chat_with_dbt_model(model, tokenizer, conversation)
print(response)
Configuration Parameters
You can customize the following parameters in the script:
- MODEL_NAME: Base model to fine-tune
- DATA_PATH: Path to your training data file
- OUTPUT_DIR: Directory to save the model
- HF_REPO_NAME: Repository name on Hugging Face Hub
- LORA_R: Rank of the LoRA matrices
- LORA_ALPHA: Scaling factor for LoRA layers
- LORA_DROPOUT: Dropout probability for LoRA layers
- BATCH_SIZE: Batch size for training
- GRADIENT_ACCUMULATION_STEPS: Number of steps to accumulate gradients
- LEARNING_RATE: Learning rate for training
- NUM_EPOCHS: Number of training epochs
- MAX_SEQ_LENGTH: Maximum sequence length for training
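In fine_tune_dbt.py these appear as module-level constants; the values below are illustrative assumptions, not the script's actual defaults:
# Illustrative values only; check fine_tune_dbt.py for the real defaults
MODEL_NAME = "HuggingFaceTB/SmolLM2-360M-Instruct"
DATA_PATH = "linehan_guide.md"
OUTPUT_DIR = "lora-DBT-model"
HF_REPO_NAME = "zayzay58/lora-DBT-model"
LORA_R = 16
LORA_ALPHA = 32
LORA_DROPOUT = 0.05
BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4
LEARNING_RATE = 2e-4
NUM_EPOCHS = 3
MAX_SEQ_LENGTH = 512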
License
This project is licensed under the MIT License - see the LICENSE file for details.