SaulLM-7B-AnomalyDetector

LoRA fine-tuned adapter for Equall/Saul-7B-Base on anomaly detection in Terms of Service clauses.
The model predicts whether a clause is anomalous (unfair) or standard, answering Yes or No with a brief explanation.


Model Details

Model Description

  • Developed by: Noshitha Juttu (University of Massachusetts Amherst, MS CS)
  • Model type: Causal Language Model + LoRA adapter
  • Language(s): English
  • License: MIT (same as base model)
  • Finetuned from: Equall/Saul-7B-Base

Model Sources


Uses

Direct Use

  • Detecting unfair or anomalous clauses in consumer Terms of Service.
  • Outputs Yes/No + brief justification.
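
Because the output is free-form generated text, downstream code usually needs to separate the Yes/No verdict from the justification. The helper below is a minimal sketch of one way to do that. The function name `parse_verdict` is hypothetical, and it assumes the verdict appears as a standalone "Yes" or "No" near the start of the completion, which the card does not guarantee.

```python
import re
from typing import Optional, Tuple


def parse_verdict(generated: str) -> Tuple[Optional[str], str]:
    """Split generated text into a Yes/No verdict and the remaining justification.

    Assumption: the first standalone "yes" or "no" in the text is the verdict.
    Returns (None, text) when no verdict is found.
    """
    match = re.search(r"\b(yes|no)\b", generated, flags=re.IGNORECASE)
    if match is None:
        return None, generated.strip()
    verdict = match.group(1).capitalize()
    # Drop punctuation and whitespace that separate the verdict from the explanation.
    justification = generated[match.end():].lstrip(" .,:;-\n").strip()
    return verdict, justification


print(parse_verdict("Yes. The clause allows termination without notice."))
```

Decoded output from `generate` includes the prompt, so strip the input clause from the text before parsing; otherwise a "no" inside the clause itself could be mistaken for the verdict.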

Downstream Use

  • Integrating into compliance tools for legal transparency.
  • Supporting consumer advocacy research.

Out-of-Scope Use

  • Not a substitute for legal advice.
  • Not tested on contracts outside consumer ToS.
  • Should not be used as the sole basis for regulatory or compliance decisions.

Bias, Risks, and Limitations

  • Trained only on Claudette ToS dataset; may not generalize.
  • May overpredict the “standard” label, classifying unfair clauses as fair (false negatives).
  • Explanations are generated text, not guaranteed legally rigorous.
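
One way to quantify the false-negative tendency listed above is to measure recall on the anomalous class over a labeled evaluation set. The following is a minimal sketch, assuming gold and predicted labels are the strings "Yes" (anomalous) and "No" (standard); the function name and label encoding are hypothetical and not taken from the training code.

```python
from typing import Dict, Sequence


def anomaly_recall(y_true: Sequence[str], y_pred: Sequence[str], positive: str = "Yes") -> Dict[str, float]:
    """Compute recall and false-negative rate for the anomalous (positive) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    total_positive = tp + fn
    recall = tp / total_positive if total_positive else 0.0
    fn_rate = fn / total_positive if total_positive else 0.0
    return {"recall": recall, "false_negative_rate": fn_rate}


# Toy labels for illustration only; these are not results from this model.
metrics = anomaly_recall(["Yes", "Yes", "No", "Yes"], ["Yes", "No", "No", "No"])
print(metrics)
```

A low recall on the "Yes" class means unfair clauses are slipping through as standard, which is the failure mode that matters most for consumer-protection use.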

Recommendations

Use alongside human review and legal expertise.


How to Get Started with the Model

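from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base = "Equall/Saul-7B-Base"
adapter = "Noshitha98/SaulLM-7B-AnomalyDetector"

# Load the frozen base model, then attach the LoRA adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")
model = PeftModel.from_pretrained(model, adapter)
model.eval()

text = "The company reserves the right to terminate your account at any time without notice."
# Send inputs to the device chosen by device_map="auto" instead of hard-coding "cuda".
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
# Skip special tokens so the printed text contains only the clause and the answer.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))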


Training Details

Training Data

  • Dataset: Claudette ToS
  • Balanced: 1000 anomalous, 1000 normal clauses
  • Splits: 70% train (1400), 20% validation (400), 10% test (200)
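
The split sizes above can be reproduced with a per-class shuffle. This is a sketch only: the card does not say whether the split was stratified or which seed was used, so the stratification, the seed, the function name, and the "Yes"/"No" label strings are all assumptions.

```python
import random
from typing import Dict, List, Sequence, Tuple


def stratified_split(
    anomalous: Sequence[str],
    normal: Sequence[str],
    seed: int = 0,  # hypothetical seed
    fractions: Tuple[float, float, float] = (0.7, 0.2, 0.1),
) -> Dict[str, List[Tuple[str, str]]]:
    """Split each class 70/20/10 so every split stays balanced."""
    rng = random.Random(seed)
    splits: Dict[str, List[Tuple[str, str]]] = {"train": [], "validation": [], "test": []}
    for label, clauses in (("Yes", anomalous), ("No", normal)):
        shuffled = list(clauses)
        rng.shuffle(shuffled)
        n_train = round(len(shuffled) * fractions[0])
        n_val = round(len(shuffled) * fractions[1])
        splits["train"] += [(c, label) for c in shuffled[:n_train]]
        splits["validation"] += [(c, label) for c in shuffled[n_train:n_train + n_val]]
        # Whatever remains after train and validation goes to the test split.
        splits["test"] += [(c, label) for c in shuffled[n_train + n_val:]]
    return splits


# Placeholder clause strings standing in for the 1000 + 1000 real clauses.
demo = stratified_split([f"anomalous {i}" for i in range(1000)],
                        [f"normal {i}" for i in range(1000)])
print({name: len(rows) for name, rows in demo.items()})
```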

Training Procedure

  • Quantization: 4-bit (NF4, bitsandbytes)
  • Fine-tuning: LoRA adapters applied to q_proj, k_proj, v_proj, o_proj
  • Max sequence length: 128 tokens
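
The settings above map onto the `BitsAndBytesConfig` and `LoraConfig` classes roughly as follows. This is a configuration sketch, not the author's training script: the LoRA rank, alpha, and dropout are not given in this card and are placeholder assumptions, as is the float16 compute dtype.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

MAX_SEQ_LEN = 128  # from the card: inputs are truncated to 128 tokens

# 4-bit NF4 quantization of the frozen base model via bitsandbytes.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # assumption: not stated in the card
)

# LoRA adapters on the four attention projections named in the card.
lora_config = LoraConfig(
    r=16,               # assumption: rank is not stated in the card
    lora_alpha=32,      # assumption
    lora_dropout=0.05,  # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

`bnb_config` would be passed as `quantization_config` to `AutoModelForCausalLM.from_pretrained`, and `lora_config` to `peft.get_peft_model` after preparing the quantized model for training.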

Training Hyperparameters

  • Epochs: 3
  • Batch size: 1
  • Learning rate: 3e-5
  • Optimizer: paged_adamw_32bit
  • Gradient checkpointing: enabled
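
Expressed as Hugging Face `TrainingArguments`, these hyperparameters would look roughly like the sketch below. The output directory is a hypothetical path, and any setting not listed in this card (for example gradient accumulation or mixed precision) is left at the library default rather than guessed.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="saul7b-anomaly-lora",  # hypothetical path
    num_train_epochs=3,
    per_device_train_batch_size=1,
    learning_rate=3e-5,
    optim="paged_adamw_32bit",
    gradient_checkpointing=True,
    save_strategy="epoch",  # the card reports one checkpoint per epoch
)
```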

Speeds, Sizes, Times

  • Train runtime: ~4.3 hours (3 epochs)
  • GPU: NVIDIA Titan X (12 GB, Gypsum cluster)
  • Checkpoints: saved per epoch
  • Final adapter size: ~55 MB

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: NVIDIA Titan X (12 GB)
  • Hours used: ~4
  • Provider: UMass Gypsum HPC
  • Carbon Estimate: <1 kg CO₂ (low academic footprint)
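
The estimate can be sanity-checked with a back-of-the-envelope calculation: energy is GPU power times runtime, and emissions are energy times grid carbon intensity. The sketch below assumes the GPU draws its full 250 W board power for the whole run, and uses a placeholder grid intensity of 0.4 kg CO₂ per kWh and a PUE of 1.0; neither of the last two values is a measured figure for the Gypsum cluster.

```python
def estimate_co2_kg(gpu_power_watts: float, hours: float,
                    carbon_intensity_kg_per_kwh: float, pue: float = 1.0) -> float:
    """Estimate emissions as power (kW) x time (h) x PUE x grid carbon intensity."""
    energy_kwh = gpu_power_watts / 1000 * hours * pue
    return energy_kwh * carbon_intensity_kg_per_kwh


# 250 W board power, ~4.3 h runtime, placeholder intensity of 0.4 kg CO2/kWh.
estimate = estimate_co2_kg(gpu_power_watts=250, hours=4.3, carbon_intensity_kg_per_kwh=0.4)
print(f"{estimate:.2f} kg CO2")
```

Under these assumptions the run stays well below 1 kg, consistent with the figure reported above.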

Technical Specifications

Model Architecture and Objective

  • Base: Saul-7B (Mistral-7B-based, LLaMA-style causal LM)
  • LoRA params: around 13M trainable (approx. 0.18% of total)
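
The trainable-parameter figure can be cross-checked by counting LoRA weights directly: each adapted linear layer of shape (d_out, d_in) adds r × (d_in + d_out) parameters. The sketch below assumes Mistral-7B-style dimensions (hidden size 4096, 32 layers, key/value projection width 1024 from grouped-query attention) and a LoRA rank of 16. The rank is not stated in this card and is an assumption chosen because it lands close to the reported count.

```python
def lora_param_count(rank: int, hidden: int = 4096, kv_dim: int = 1024, layers: int = 32) -> int:
    """Count LoRA parameters for q_proj, k_proj, v_proj, and o_proj across all layers."""
    # (d_in, d_out) for each adapted projection; k/v are narrower under grouped-query attention.
    shapes = {
        "q_proj": (hidden, hidden),
        "k_proj": (hidden, kv_dim),
        "v_proj": (hidden, kv_dim),
        "o_proj": (hidden, hidden),
    }
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * layers


count = lora_param_count(rank=16)      # assumed rank
size_mb = count * 4 / 1e6              # 4 bytes per parameter if saved in float32
print(count, round(size_mb, 1))
```

Under these assumptions the count is about 13.6 million parameters and roughly 54.5 MB in float32, in line with the approximate figures reported in this card.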

Compute Infrastructure

  • Hardware: 1x NVIDIA Titan X
  • Software: PyTorch 2.2, Transformers 4.51, PEFT 0.15.2, bitsandbytes

Glossary

  • LoRA (Low-Rank Adaptation): A parameter-efficient fine-tuning method where only small adapter matrices are trained, while the large base model remains frozen. This drastically reduces compute and storage costs.
  • 4-bit Quantization: A compression technique that reduces model weights from 16/32-bit floating-point numbers to 4-bit representations. This allows large models (like Saul-7B) to fit and run on smaller GPUs with minimal accuracy loss.
  • ToS (Terms of Service) Anomaly Detection: The task of identifying clauses in service agreements that are potentially unfair, unusual, or restrictive for consumers (e.g., sudden account termination, hidden fees).
  • PEFT (Parameter-Efficient Fine-Tuning): A family of methods (like LoRA) that fine-tune large models by updating only a small subset of parameters instead of the entire model.
  • Epoch: One full pass through the training dataset during model fine-tuning.
  • Checkpoint: A saved state of the model during training, used for resuming training or restoring the best-performing version.

Model Card Authors

  • Noshitha Juttu – M.S. in Computer Science, University of Massachusetts Amherst
  • Research focus: NLP, model compression, on-device NLP, and Parameter-Efficient Fine-Tuning (PEFT).

📚 Citation

If you use this model in your research or work, please cite the following paper:

Juttu, Noshitha Padma Pratyusha. Text to Trust: Evaluating Fine-Tuning and LoRA Trade-Offs in Language Models for Unfair Terms of Service Detection. arXiv preprint arXiv:2510.22531, 2025.
https://arxiv.org/abs/2510.22531

Model Card Contact

For questions, feedback, or collaborations, please reach out:

Framework Versions

  • Transformers: 4.51.3
  • PEFT: 0.15.2
  • PyTorch: 2.2.2
  • Datasets: 2.21.0
