LoRA Fine-tuning for DBT Model

This repository contains code for fine-tuning a language model using Low-Rank Adaptation (LoRA) on Dialectical Behavior Therapy (DBT) content. The model is based on SmolLM2-360M-Instruct and fine-tuned specifically on Marsha M. Linehan's DBT® Skills Training Manual, Second Edition (The Guilford Press, 2014).

Overview

This project adapts a language model to understand and generate content related to Dialectical Behavior Therapy (DBT). The training text comes from Marsha M. Linehan's DBT® Skills Training Manual (Second Edition), the standard skills reference for DBT practitioners. LoRA (Low-Rank Adaptation) freezes the pre-trained weights and trains only small low-rank matrices added to selected layers, so the model can be adapted to a new domain without retraining every parameter. This lowers memory and compute requirements considerably compared with full fine-tuning, while typically retaining most of its quality.
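
To make the efficiency argument concrete, the sketch below counts trainable parameters for a single weight matrix under full fine-tuning and under LoRA. The layer size and rank are illustrative assumptions, not values taken from this repository's training script.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix W (d_out x d_in) is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, r: int) -> int:
    """Trainable parameters for a LoRA update B @ A, with B (d_out x r) and A (r x d_in)."""
    return r * (d_out + d_in)


# Assumed sizes for illustration: one square 960 x 960 projection, rank 8.
d_out, d_in, r = 960, 960, 8
full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, r)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

Because the LoRA count grows with r * (d_out + d_in) rather than d_out * d_in, the saving becomes larger as the layers get wider.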

Requirements

  • Python 3.8+
  • PyTorch
  • Transformers
  • PEFT (Parameter-Efficient Fine-Tuning)
  • Datasets
  • Hugging Face Hub account
  • Weights & Biases account (for tracking experiments)
  • LangChain (installed by the command below)

Install the Python packages:

pip install torch transformers peft datasets huggingface_hub wandb langchain

Project Structure

.
├── README.md
├── fine_tune_dbt.py           # Main training script
├── linehan_guide.md           # Training data (chunked sections of Linehan's DBT® Skills Training Manual)
└── lora-DBT-model/            # Output directory for model

Setup

  1. Clone this repository:
git clone https://github.com/yourusername/lora-dbt-model.git
cd lora-dbt-model
  2. Set your Hugging Face and Weights & Biases tokens in fine_tune_dbt.py:
    • Replace 'hftoken' with your Hugging Face access token (write access is needed to push the model)
    • Replace 'wandbkey' with your Weights & Biases API key
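
Tokens pasted into a script are easy to commit by accident. One alternative, sketched here, reads them from environment variables. The helpers read_secret and login_all are hypothetical names introduced for illustration and are not part of fine_tune_dbt.py. HF_TOKEN and WANDB_API_KEY are the variable names the two libraries conventionally look for.

```python
import os


def read_secret(name: str, env=None) -> str:
    """Return a non-empty secret from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Set the {name} environment variable before training.")
    return value


def login_all() -> None:
    """Log in to both services; imports are local so this file loads without them."""
    from huggingface_hub import login
    import wandb

    login(token=read_secret("HF_TOKEN"))
    wandb.login(key=read_secret("WANDB_API_KEY"))


# Usage (assumption): call login_all() once near the top of the training script.
```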

Training Process

The training process includes the following steps:

  1. Loading the base model (SmolLM2-360M-Instruct)
  2. Loading and preprocessing the text data
  3. Tokenizing the data
  4. Configuring LoRA parameters
  5. Training the model
  6. Saving the LoRA adapters

To run the training:

python fine_tune_dbt.py

After training, the script pushes the LoRA adapters to the Hugging Face Hub under the repository name set in HF_REPO_NAME.
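
The six steps listed above can be sketched end to end as follows. This is not the contents of fine_tune_dbt.py: the helper chunk_text is a simplified stand-in for whatever chunking the script performs, and the LoRA values, target modules, and training arguments are assumptions chosen only to make the sketch complete.

```python
def chunk_text(text: str, max_chars: int = 1000) -> list:
    """Group blank-line-separated paragraphs into chunks of roughly max_chars.

    A single paragraph longer than max_chars is kept whole.
    """
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def train(data_path="linehan_guide.md", output_dir="lora-DBT-model"):
    # Heavy imports are local so chunk_text can be used without them.
    from datasets import Dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    # Step 1: load the base model and tokenizer.
    base = "HuggingFaceTB/SmolLM2-360M-Instruct"
    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForCausalLM.from_pretrained(base)
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token

    # Step 2: load and chunk the text data.
    with open(data_path, encoding="utf-8") as f:
        dataset = Dataset.from_dict({"text": chunk_text(f.read())})

    # Step 3: tokenize each chunk.
    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)

    dataset = dataset.map(tokenize, batched=True, remove_columns=["text"])

    # Step 4: configure LoRA (all values here are assumptions).
    lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
    model = get_peft_model(model, lora)

    # Step 5: train with a causal language modeling collator.
    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir=output_dir, num_train_epochs=1,
                               per_device_train_batch_size=4, report_to="none"),
        train_dataset=dataset,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()

    # Step 6: save only the LoRA adapter weights.
    model.save_pretrained(output_dir)
    # Optional: model.push_to_hub("your-username/lora-DBT-model")  # hypothetical repo ID
```

Calling train() downloads the base model and starts a training run, so the sketch only defines the function and leaves the call to the reader.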

Running Inference

After training, you can run inference using the Hugging Face pipeline. Here's how to use your fine-tuned model for chat:

from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel, PeftConfig

# Load the fine-tuned model
model_name = "zayzay58/lora-DBT-model"  # Replace with your actual model path or HF Hub ID

# Read the adapter config to find the base model, load it, then attach the LoRA adapter
config = PeftConfig.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(base_model, model_name)

# Create a text-generation pipeline
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,      # limit on generated tokens, not counting the prompt
    do_sample=True,          # required for temperature and top_p to take effect
    temperature=0.7,
    top_p=0.95,
    repetition_penalty=1.15
)

# Format the input with the chat template that ships with the base model's tokenizer,
# rather than hard-coding role markers that may not match what the model expects.
messages = [
    {"role": "system", "content": "You are a helpful assistant trained in DBT techniques."},
    {"role": "user", "content": "What are some DBT skills for emotion regulation?"}
]
chat_input = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Generate a response; return_full_text=False returns only the newly generated text
response = pipe(chat_input, return_full_text=False)[0]['generated_text']
print(response.strip())

Chat Template

For interactive chat applications, you can use this template:

def chat_with_dbt_model(model, tokenizer, conversation_history=None):
    if conversation_history is None:
        conversation_history = [
            {"role": "system", "content": "You are a helpful assistant trained in DBT techniques."}
        ]

    # Build the prompt with the tokenizer's own chat template. The generation
    # prompt (the opening of an assistant turn) is added unless the history
    # already ends with an assistant message.
    inputs = tokenizer.apply_chat_template(
        conversation_history,
        add_generation_prompt=conversation_history[-1]["role"] != "assistant",
        return_tensors="pt",
        return_dict=True
    ).to(model.device)

    # Generate response
    output = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        top_p=0.95,
        repetition_penalty=1.15,
        do_sample=True
    )

    # Decode only the newly generated tokens, skipping the prompt
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    assistant_response = tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

    return assistant_response

# Example usage
conversation = [
    {"role": "system", "content": "You are a helpful assistant trained in DBT techniques."},
    {"role": "user", "content": "What are some DBT skills for emotion regulation?"}
]

response = chat_with_dbt_model(model, tokenizer, conversation)
print(response)

# Continue the conversation
conversation.append({"role": "assistant", "content": response})
conversation.append({"role": "user", "content": "Can you explain the TIPP skill in more detail?"})

response = chat_with_dbt_model(model, tokenizer, conversation)
print(response)
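
For a terminal session, the function above can be wrapped in a small read-reply loop. The sketch below is one possible shape. run_chat and its parameters are hypothetical names, and the reply function is passed in so the loop can be tried without loading a model.

```python
def run_chat(reply_fn, system_prompt, input_fn=input, output_fn=print, max_turns=None):
    """Simple chat loop: read user text, append it to the history, show the reply.

    reply_fn takes the full message history (a list of role/content dicts)
    and returns the assistant's reply as a string. Typing 'quit' or 'exit'
    ends the session. Returns the final history.
    """
    history = [{"role": "system", "content": system_prompt}]
    turns = 0
    while max_turns is None or turns < max_turns:
        try:
            user_text = input_fn("You: ").strip()
        except EOFError:
            break
        if user_text.lower() in {"quit", "exit"}:
            break
        if not user_text:
            continue  # ignore empty lines
        history.append({"role": "user", "content": user_text})
        reply = reply_fn(history)
        history.append({"role": "assistant", "content": reply})
        output_fn(f"Assistant: {reply}")
        turns += 1
    return history


# Usage with the function defined above (model and tokenizer already loaded):
# run_chat(lambda h: chat_with_dbt_model(model, tokenizer, h),
#          "You are a helpful assistant trained in DBT techniques.")
```

Since the whole history is sent on every turn, long sessions will eventually exceed the model's context window. Trimming older turns is left out of this sketch.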

Configuration Parameters

You can customize the following parameters in the script:

  • MODEL_NAME: Base model to fine-tune
  • DATA_PATH: Path to your training data file
  • OUTPUT_DIR: Directory to save the model
  • HF_REPO_NAME: Repository name on Hugging Face Hub
  • LORA_R: Rank of the LoRA matrices
  • LORA_ALPHA: Scaling factor for LoRA layers
  • LORA_DROPOUT: Dropout probability for LoRA layers
  • BATCH_SIZE: Batch size for training
  • GRADIENT_ACCUMULATION_STEPS: Number of steps to accumulate gradients
  • LEARNING_RATE: Learning rate for training
  • NUM_EPOCHS: Number of training epochs
  • MAX_SEQ_LENGTH: Maximum sequence length for training
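
Two of these settings interact with others: the batch size combines with gradient accumulation, and LORA_ALPHA is applied relative to LORA_R. The sketch below groups the parameters in a dataclass to show both relationships. Every default value is an illustrative assumption, not the setting used in fine_tune_dbt.py, and TrainConfig is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass
class TrainConfig:
    # All defaults are assumed values for illustration only.
    MODEL_NAME: str = "HuggingFaceTB/SmolLM2-360M-Instruct"
    DATA_PATH: str = "linehan_guide.md"
    OUTPUT_DIR: str = "lora-DBT-model"
    HF_REPO_NAME: str = "your-username/lora-DBT-model"
    LORA_R: int = 8
    LORA_ALPHA: int = 16
    LORA_DROPOUT: float = 0.05
    BATCH_SIZE: int = 4
    GRADIENT_ACCUMULATION_STEPS: int = 4
    LEARNING_RATE: float = 2e-4
    NUM_EPOCHS: int = 3
    MAX_SEQ_LENGTH: int = 512

    @property
    def effective_batch_size(self) -> int:
        """Sequences contributing to each optimizer step on a single device."""
        return self.BATCH_SIZE * self.GRADIENT_ACCUMULATION_STEPS

    @property
    def lora_scaling(self) -> float:
        """Factor applied to the low-rank update in standard LoRA: alpha / r."""
        return self.LORA_ALPHA / self.LORA_R


cfg = TrainConfig()
print(cfg.effective_batch_size, cfg.lora_scaling)
```

Raising LORA_R without also raising LORA_ALPHA shrinks the scaling factor, so the two are usually tuned together.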

License

This project is licensed under the MIT License - see the LICENSE file for details.
