LoRA Fine-tuning for DBT Model
This repository contains code for fine-tuning a language model using Low-Rank Adaptation (LoRA) on Dialectical Behavior Therapy (DBT) content. The model is based on SmolLM2-360M-Instruct and fine-tuned specifically on Marsha M. Linehan's DBT® Skills Training Manual, Second Edition (The Guilford Press, 2014).
Table of Contents
- Overview
- Requirements
- Project Structure
- Setup
- Training Process
- Running Inference
- Configuration Parameters
- License
Overview
This project fine-tunes a language model specifically for understanding and generating content related to Dialectical Behavior Therapy (DBT). The model is trained on Marsha M. Linehan's authoritative DBT® Skills Training Manual (Second Edition), which is the definitive resource for DBT practitioners. By using LoRA (Low-Rank Adaptation), we can efficiently adapt pre-trained language models to specific domains without the need to retrain the entire model. This approach significantly reduces computational requirements while maintaining performance.
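Concretely, LoRA freezes the base model's weights and learns a low-rank update ΔW = B·A, where B and A share a small rank r, so only those two small matrices are trained. A back-of-the-envelope sketch of the savings for a single square projection (the dimensions and rank below are assumed for illustration):
# Trainable parameters: full weight matrix vs. its rank-r LoRA factors
d, k, r = 960, 960, 16              # projection dims and LoRA rank (assumed values)
full_params = d * k                 # 921,600 if we fine-tuned the matrix directly
lora_params = r * (d + k)           # 30,720 for B (d x r) plus A (r x k)
print(f"LoRA trains {lora_params / full_params:.1%} of this layer")  # ~3.3%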
Requirements
- Python 3.8+
- PyTorch
- Transformers
- PEFT (Parameter-Efficient Fine-Tuning)
- Datasets
- Hugging Face Hub account
- Weights & Biases account (for tracking experiments)
pip install torch transformers peft datasets huggingface_hub wandb langchain
Project Structure
.
├── README.md
├── fine_tune_dbt.py # Main training script
├── linehan_guide.md # Training data (chunked sections of Linehan's DBT® Skills Training Manual)
└── lora-DBT-model/ # Output directory for model
Setup
- Clone this repository:
git clone https://github.com/yourusername/lora-dbt-model.git
cd lora-dbt-model
- Set up your Hugging Face and Weights & Biases tokens:
  - Replace 'hftoken' with your actual Hugging Face token
  - Replace 'wandbkey' with your actual Weights & Biases key
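If you prefer to authenticate in code, a minimal sketch (here 'hftoken' and 'wandbkey' are the same placeholder strings used in the script):
from huggingface_hub import login
import wandb

login(token="hftoken")       # placeholder: your Hugging Face access token
wandb.login(key="wandbkey")  # placeholder: your Weights & Biases API key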
Training Process
The training process includes the following steps:
- Loading the base model (SmolLM2-360M-Instruct)
- Loading and preprocessing the text data
- Tokenizing the data
- Configuring LoRA parameters (see the sketch after this list)
- Training the model
- Saving the LoRA adapters
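A minimal sketch of the LoRA configuration step with PEFT; the rank, alpha, dropout, and target module names below are illustrative assumptions, not values taken from fine_tune_dbt.py:
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-360M-Instruct")
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                                 # LORA_R: rank of the update matrices (assumed)
    lora_alpha=32,                        # LORA_ALPHA: scaling factor (assumed)
    lora_dropout=0.05,                    # LORA_DROPOUT (assumed)
    target_modules=["q_proj", "v_proj"],  # assumed attention projections to adapt
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()        # confirms only a small fraction is trainable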
To run the training:
python fine_tune_dbt.py
The script will automatically push the model to the Hugging Face Hub under the specified repository name.
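The last two steps typically boil down to calls like these (a sketch; OUTPUT_DIR and HF_REPO_NAME are the configuration variables described under Configuration Parameters):
# Save the LoRA adapter weights (not the full base model), then push them to the Hub
model.save_pretrained(OUTPUT_DIR)
tokenizer.save_pretrained(OUTPUT_DIR)
model.push_to_hub(HF_REPO_NAME)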
Running Inference
After training, you can run inference using the Hugging Face pipeline. Here's how to use your fine-tuned model for chat:
from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel, PeftConfig
# Load the fine-tuned model
model_name = "zayzay58/lora-DBT-model" # Replace with your actual model path or HF Hub ID
# Load the adapter config, base model, tokenizer, and LoRA weights from the Hub
config = PeftConfig.from_pretrained(model_name)
base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(base_model, model_name)
# Create a chat pipeline
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=512,     # cap on generated tokens (max_length would count the prompt too)
    do_sample=True,         # required for temperature/top_p to take effect
    temperature=0.7,
    top_p=0.95,
    repetition_penalty=1.15
)
# Format your input for chat - the exact format depends on the base model,
# so let the tokenizer apply the model's own chat template
messages = [
    {"role": "system", "content": "You are a helpful assistant trained in DBT techniques."},
    {"role": "user", "content": "What are some DBT skills for emotion regulation?"}
]
chat_input = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Generate a response; return_full_text=False returns only the newly generated text
response = pipe(chat_input, return_full_text=False)[0]['generated_text']
print(response.strip())
Chat Template
For interactive, multi-turn chat you can wrap generation in a helper like this:
def chat_with_dbt_model(model, tokenizer, conversation_history=None):
    if conversation_history is None:
        conversation_history = [
            {"role": "system", "content": "You are a helpful assistant trained in DBT techniques."}
        ]
    # Apply the base model's chat template and append the assistant header
    # so the model generates a reply
    formatted_prompt = tokenizer.apply_chat_template(
        conversation_history, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
    # Generate response
    output = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
        repetition_penalty=1.15
    )
    # Decode only the newly generated tokens, i.e. the assistant's reply
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    assistant_response = tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
    return assistant_response
# Example usage
conversation = [
{"role": "system", "content": "You are a helpful assistant trained in DBT techniques."},
{"role": "user", "content": "What are some DBT skills for emotion regulation?"}
]
response = chat_with_dbt_model(model, tokenizer, conversation)
print(response)
# Continue the conversation
conversation.append({"role": "assistant", "content": response})
conversation.append({"role": "user", "content": "Can you explain the TIPP skill in more detail?"})
response = chat_with_dbt_model(model, tokenizer, conversation)
print(response)
Configuration Parameters
You can customize the following parameters in the script:
- MODEL_NAME: Base model to fine-tune
- DATA_PATH: Path to your training data file
- OUTPUT_DIR: Directory to save the model
- HF_REPO_NAME: Repository name on Hugging Face Hub
- LORA_R: Rank of the LoRA matrices
- LORA_ALPHA: Scaling factor for LoRA layers
- LORA_DROPOUT: Dropout probability for LoRA layers
- BATCH_SIZE: Batch size for training
- GRADIENT_ACCUMULATION_STEPS: Number of steps to accumulate gradients
- LEARNING_RATE: Learning rate for training
- NUM_EPOCHS: Number of training epochs
- MAX_SEQ_LENGTH: Maximum sequence length for training
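In fine_tune_dbt.py these appear as module-level constants; the values below are illustrative assumptions, not the script's actual defaults:
# Illustrative values only; check fine_tune_dbt.py for the real defaults
MODEL_NAME = "HuggingFaceTB/SmolLM2-360M-Instruct"
DATA_PATH = "linehan_guide.md"
OUTPUT_DIR = "lora-DBT-model"
HF_REPO_NAME = "zayzay58/lora-DBT-model"
LORA_R = 16
LORA_ALPHA = 32
LORA_DROPOUT = 0.05
BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 4
LEARNING_RATE = 2e-4
NUM_EPOCHS = 3
MAX_SEQ_LENGTH = 512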
License
This project is licensed under the MIT License - see the LICENSE file for details.