LoRA Adapter for Mistral-7B-Instruct - Disco Elysium Internal Debate Style

This repository contains LoRA (Low-Rank Adaptation) adapters fine-tuned on the mistralai/Mistral-7B-Instruct-v0.2 base model. The goal of this fine-tuning was to imbue the model with the unique literary style, tone, and internal thought processes found in the video game Disco Elysium, specifically focusing on generating multi-skill internal debates.

Model Description

This is not a standalone model, but rather a set of LoRA weights intended to be loaded on top of the mistralai/Mistral-7B-Instruct-v0.2 model using the PEFT library.

The fine-tuning was performed on a structured dataset derived directly from the Disco Elysium dialogue database. This dataset was specifically formatted to teach the model how to generate plausible internal reactions from multiple in-game "Skills" (like Logic, Volition, Inland Empire, etc.) in response to a given context (dialogue line, situation, or thought).

The resulting model excels at:

Adopting the literary, verbose, darkly humorous, and philosophical tone of Disco Elysium.
Generating internal monologues reflecting Harry Du Bois's ("You") perspective.
Simulating the internal debate between various Skills, often presenting multiple viewpoints in a single response.

Intended Use

This model is intended for creative text generation, role-playing, and exploring the unique narrative style of Disco Elysium. It can be prompted to:

React to situations or dialogue snippets as if processing them through the game's internal skill system.
Generate internal monologues or journal entries in Harry's voice.
Produce plausible debates between different facets of the DE psyche.

It is NOT intended as a general-purpose chatbot or Q&A model. Its knowledge and style are heavily specialized.

How to Use

First, ensure you have the necessary libraries installed:

pip install torch transformers accelerate peft bitsandbytes datasets

Then, load the base model (quantized for efficiency) and apply the LoRA adapters:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_model_name = "mistralai/Mistral-7B-Instruct-v0.2"
adapter_repo_id = "canercetin/mistral-disco-adapter" # This repo

# Load quantization config (recommended)
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the base model
print(f"Loading base model ({base_model_name})...")
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=quantization_config,
    torch_dtype=torch.bfloat16,
    device_map="auto", # Handles GPU/CPU distribution
)
print("Base model loaded.")

# Load the tokenizer (from the adapter repo is often best)
print(f"Loading tokenizer from {adapter_repo_id}...")
tokenizer = AutoTokenizer.from_pretrained(adapter_repo_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
    print("Set pad_token to eos_token")

# Load the PEFT model by applying adapters
print(f"Loading LoRA adapters from Hub: {adapter_repo_id}")
model = PeftModel.from_pretrained(base_model, adapter_repo_id)
print("LoRA adapters loaded and applied.")
model.eval() # Set to evaluation mode

# --- Prepare Prompt ---
# Use the recommended System Prompt (see below)
system_prompt = "Adopt *your* internal perspective within Disco Elysium. The input below presents a situation, external dialogue, or an internal thought. **Generate the ensuing internal debate** as different Skills (such as Logic, Volition, Empathy, Inland Empire, Electrochemistry, etc.) react to it. Present the distinct viewpoints, analyses, or urges from *multiple relevant skills*, each clearly identified (e.g., 'Skill Name: Dialogue text') and preferably on a new line. Capture the characteristic literary style, tangential thoughts, dark humor, and gritty realism of Revachol in this synthesized multi-voiced internal state. **Your output should be the multi-skill reaction itself.**"
user_input = "Kim looks disappointed in my methods again." # Example input

chat = [{"role": "system", "content": system_prompt}, {"role": "user", "content": user_input}]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# --- Generate ---
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print("\nGenerating multi-skill internal debate...")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=350,
        do_sample=True,
        temperature=0.75,
        top_p=0.9,
        repetition_penalty=1.1,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=[tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("</s>")]
    )

# --- Decode ---
input_token_length = inputs.input_ids.shape[1]  # number of prompt tokens
response_ids = outputs[0][input_token_length:]  # keep only the newly generated tokens
response_text = tokenizer.decode(response_ids, skip_special_tokens=True).strip()
print("\nModel Response:\n", response_text)
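
Some releases of the Mistral-7B-Instruct-v0.2 chat template reject a "system" role. Whether the tokenizer shipped in this repo does is not stated here, so treat that as an assumption. If apply_chat_template raises an error about conversation roles, one workaround is to fold the system prompt into the user turn by hand, using the same [INST] ... [/INST] wrapper as the training samples. The helper name build_inst_prompt is hypothetical, and this is only a minimal sketch:

```python
def build_inst_prompt(system_prompt: str, user_input: str) -> str:
    """Fold the system prompt into a single [INST] ... [/INST] user turn.

    Assumption: the tokenizer prepends the BOS token (<s>) itself when the
    string is tokenized, so it is not written into the prompt here.
    """
    merged = f"{system_prompt.strip()}\n\n{user_input.strip()}"
    return f"[INST] {merged} [/INST]"


if __name__ == "__main__":
    demo = build_inst_prompt(
        "Generate the ensuing internal debate.",
        "Kim looks disappointed in my methods again.",
    )
    print(demo)
```

The returned string can be passed to tokenizer(prompt, return_tensors="pt") in place of the chat-template output.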

Example Prompts & Outputs

(Using the System Prompt above)

Input:

Should I piss myself?

Output:

Composure: A bad idea. Men like the lieutenant are not won over by such childish manoeuvres.
Half Light: So worth it -- can you feel how liberating that was? De-compression completed.

Input:

I love you.

Output:

Half Light: The fucking nerves on her?! Trying to hide behind WEAKNESS. You're a GOD. An angry but JUST God.
Rhetoric: This exact conversation has happened before. Establishing authority before trust -- classic mistake.

(Note: Output may vary due to sampling)

Input:

Lieutenant, put on the maid outfit.

Output:

Esprit de Corps: The lieutenant is concerned about your erratic behaviour -- and the potential consequences for the case.
Rhetoric: He's trying not to bring you down, but the words aren't coming out right.
Suggestion: What are you talking about? Don't say that, this is no time to nitpick.
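
Because responses follow a "Skill Name: text" layout, they can be split into per-skill lines for display or filtering. The sketch below is a heuristic, not part of this repo: the regular expression assumes a skill name of one to four words starting with a capital letter, and parse_debate is a hypothetical helper name.

```python
import re

# Assumption: a skill line looks like "Skill Name: text", where the name is
# one to four words and starts with a capital letter
# (e.g. "Half Light", "Esprit de Corps").
SKILL_LINE = re.compile(r"^\s*([A-Z][A-Za-z]*(?: [A-Za-z]+){0,3}):\s*(.+)$")


def parse_debate(response_text: str) -> list[tuple[str, str]]:
    """Split a generated response into (skill, text) pairs, one per skill line."""
    pairs: list[tuple[str, str]] = []
    for line in response_text.splitlines():
        match = SKILL_LINE.match(line)
        if match:
            pairs.append((match.group(1), match.group(2).strip()))
        elif pairs and line.strip():
            # Continuation line: append it to the previous skill's text.
            skill, text = pairs[-1]
            pairs[-1] = (skill, f"{text} {line.strip()}")
    return pairs


if __name__ == "__main__":
    sample = (
        "Esprit de Corps: The lieutenant is concerned about your erratic behaviour.\n"
        "Rhetoric: He's trying not to bring you down."
    )
    for skill, text in parse_debate(sample):
        print(f"[{skill}] {text}")
```

Counting the distinct skills in the result is a cheap way to spot responses that failed to invoke more than one skill. When the model puts several skills on one line, this parser only detects the first of them.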

Limitations and Bias

Highly Specialized: This model will perform poorly on general knowledge, coding, or conversational tasks outside the Disco Elysium persona. It often fails to follow instructions that conflict with its trained style.
Potential for Offensive Content: Disco Elysium contains mature themes, coarse language, discussions of violence, addiction, trauma, suicide, and various offensive ideologies (racism, fascism, etc.). The model was trained on this text and may generate offensive, harmful, or inappropriate content without warning. User discretion is strongly advised. This model is intended for artistic exploration of the game's style and should not be used in sensitive applications.
Inconsistency: While trained to generate multi-skill debates, it may sometimes produce shorter responses, repeat itself, fail to invoke multiple skills, or generate responses that are nonsensical or only tangentially related to the prompt.
Factual Hallucinations: It may generate text that seems like DE lore but is inaccurate or invented.
Character Limit Sensitivity: Character limits given in the prompt may be ignored or may lead to awkward truncation.

Training Data

This model was fine-tuned on a processed dataset derived from a publicly available Disco Elysium dialogue database (discobase3-29-2021-9-32-09-PM.db). The data was structured into Context -> Multi-Skill Response pairs using a custom Python script. The specific steps included:

Loading dialogue entries and actor information.
Identifying context lines (dialogue from non-Skill, non-HUB actors).
Looking ahead a fixed number of steps (20) within the same conversation to find subsequent lines spoken by different Skill actors.
Filtering out system entries (Actor 0), '0' text, bracketed player choices/actions ([...]), and short lines (<10 chars).
Formatting valid sequences into <s>[INST] Context [/INST] Skill A: Response A\nSkill B: Response B... </s> strings.
This resulted in approximately 25,000 training samples.
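
The steps above can be sketched roughly as follows. This is not the original processing script: the Entry record is a hypothetical flattened view of the database rows (the real schema may differ), and keeping only the first usable line per skill is an assumption.

```python
from dataclasses import dataclass

LOOKAHEAD_STEPS = 20  # look-ahead window described above
MIN_LINE_CHARS = 10   # lines shorter than this are dropped


@dataclass
class Entry:
    """Hypothetical flattened dialogue row; the real database schema may differ."""
    conversation_id: int
    actor_id: int
    actor_name: str
    is_skill: bool
    text: str
    is_hub: bool = False


def is_usable(entry: Entry) -> bool:
    """Drop system entries (Actor 0), '0' text, bracketed choices, and short lines."""
    text = entry.text.strip()
    if entry.actor_id == 0 or text == "0":
        return False
    if text.startswith("[") and text.endswith("]"):
        return False
    return len(text) >= MIN_LINE_CHARS


def build_samples(entries: list[Entry]) -> list[str]:
    """Pair each context line with the skill lines that follow it in the same conversation."""
    samples = []
    for i, context in enumerate(entries):
        if context.is_skill or context.is_hub or not is_usable(context):
            continue
        responses: dict[str, str] = {}
        for nxt in entries[i + 1 : i + 1 + LOOKAHEAD_STEPS]:
            if nxt.conversation_id != context.conversation_id:
                break
            # Assumption: keep only the first usable line from each skill.
            if nxt.is_skill and is_usable(nxt) and nxt.actor_name not in responses:
                responses[nxt.actor_name] = nxt.text.strip()
        if responses:
            body = "\n".join(f"{name}: {text}" for name, text in responses.items())
            samples.append(f"<s>[INST] {context.text.strip()} [/INST] {body} </s>")
    return samples
```

Each returned string is one training sample in the Context -> Multi-Skill Response format.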

Note on Dataset Publication: The underlying dialogue text is copyrighted material belonging to ZA/UM Studio. Therefore, the processed dataset itself cannot be shared publicly. However, the script used for processing could potentially be shared for reproducibility if users obtain the database file legally themselves.

Training Procedure

Base Model: mistralai/Mistral-7B-Instruct-v0.2
Fine-tuning Method: LoRA using PEFT library.
Configuration: r=16, lora_alpha=32, lora_dropout=0.1, targeted QKV, O, and MLP projections.
Quantization: Loaded base model in 4-bit using bitsandbytes (nf4, compute_dtype=bfloat16).
Frameworks: transformers, accelerate, peft, torch.
Hardware: A100 GPU (via Google Colab Pro).
Training Args: per_device_train_batch_size=12, gradient_accumulation_steps=1, dataloader_num_workers=12, gradient_checkpointing=False, num_train_epochs=1 (on the structured multi-skill dataset), learning_rate=2e-4, bf16=True, optim="paged_adamw_8bit".
Final Loss: ~0.41
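
For readers who want to set up a comparable run, the LoRA settings above map onto a PEFT LoraConfig roughly as sketched below. The target module names are the projection names used by the transformers Mistral implementation, and bias and task_type are assumptions not stated in this card. The step estimate assumes a single GPU.

```python
import math

# Values stated in the Training Procedure list above, except where noted.
LORA_KWARGS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.1,
    # Assumption: QKV, O, and MLP projections under their transformers Mistral names.
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    # Assumptions: neither value is stated in this card.
    "bias": "none",
    "task_type": "CAUSAL_LM",
}


def build_lora_config():
    """Build the PEFT config; peft is imported lazily so the rest runs without it."""
    from peft import LoraConfig

    return LoraConfig(**LORA_KWARGS)


def approx_optimizer_steps(num_samples: int, batch_size: int = 12,
                           grad_accum_steps: int = 1, epochs: int = 1) -> int:
    """Rough number of optimizer steps on a single device."""
    per_epoch = math.ceil(num_samples / (batch_size * grad_accum_steps))
    return per_epoch * epochs


if __name__ == "__main__":
    # About 25,000 samples at batch size 12 for one epoch.
    print(approx_optimizer_steps(25_000))  # prints 2084
    # Standard LoRA scales the update by lora_alpha / r.
    print(LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"])  # prints 2.0
```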

Citation

If you use this model adapter, please cite the base model and the PEFT library.

Mistral 7B:

@misc{jiang2023mistral,
      title={Mistral 7B},
      author={Albert Q. Jiang and Alexandre Sablayrolles and Arthur Mensch and Chris Bamford and Devendra Singh Chaplot and Diego de las Casas and Florian Bressand and Gianna Lengyel and Guillaume Lample and Lélio Renard Lavaud and Lucile Saulnier and Marie-Anne Lachaux and Pierre Stock and Teven Le Scao and Thibaut Lavril and Thomas Wang and Timothée Lacroix and William El Sayed},
      year={2023},
      eprint={2310.06825},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

PEFT:

@misc{peft,
  author = {Sourab Mangrulkar and Sylvain Gugger and Lysandre Debut and Younes Belkada and Sayak Paul},
  title = {PEFT: Parameter-Efficient Fine-Tuning of Billion-Scale Models on Low-Resource Hardware},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/huggingface/peft}}
}

Model fine-tuned by Caner Cetin.
