bart-large-mnli: instruction tuned - v1

This model is a fine-tuned version of facebook/bart-large-mnli on the pszemraj/dolly_hhrlhf-text2text dataset.

Model description

A text2text model fine-tuned for text2text generation on a modified dataset based on the relatively more permissive mosaicml/dolly_hhrlhf dataset.

Basic usage in Python:

# pip install -q transformers accelerate
import torch
from transformers import pipeline, GenerationConfig

model_name = "pszemraj/bart-large-mnli-instruct-dolly_hhrlhf-v1"
assistant = pipeline(
    "text2text-generation",
    model_name,
    device_map="auto",
)
cfg = GenerationConfig.from_pretrained(model_name)

# pass an 'instruction' as the prompt to the pipeline
prompt = "Write a guide on how to become a ninja while working a 9-5 job."
result = assistant(prompt, generation_config=cfg)[0]["generated_text"]
print(result)

Using the generation config is optional; it can be replaced by other generation parameters passed directly to the pipeline call.
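As a minimal sketch of passing generation parameters in place of the config: the helper below merges a set of default parameters with per-call overrides and forwards them to the pipeline. The helper name `generate` and the values in `DEFAULT_PARAMS` are assumptions for illustration; they are not taken from the model's own generation config.

```python
# Hypothetical defaults for illustration only; tune them for your use case.
DEFAULT_PARAMS = {
    "max_new_tokens": 128,
    "num_beams": 4,
    "no_repeat_ngram_size": 3,
    "early_stopping": True,
}


def generate(assistant, prompt, **overrides):
    """Call a text2text pipeline with explicit generation parameters.

    `assistant` is any callable with the pipeline's call signature,
    e.g. the `assistant` object created in the usage example above.
    """
    params = {**DEFAULT_PARAMS, **overrides}  # per-call overrides win
    return assistant(prompt, **params)[0]["generated_text"]
```

With the pipeline from the basic usage example, a call such as `generate(assistant, prompt, num_beams=1, do_sample=True, top_p=0.9)` would switch from beam search to sampling for that one request.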

Intended Uses & Limitations

  • This model is not tuned with RLHF or similar methods and may produce offensive results.
  • While larger than BART-base, this model is relatively small compared to recent autoregressive models (MPT-7b, LLaMA, etc.), so its "cognition" capabilities may be practically limited for some tasks.

Training

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 4e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 3.0
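
To illustrate how the learning rate, cosine scheduler, and warmup ratio listed above interact, here is a rough sketch of a linear-warmup, half-cosine-decay schedule. It assumes the common shape (linear ramp from 0 to the peak rate, then cosine decay to 0); the card does not state the total number of optimizer steps, so `TOTAL_STEPS` is a hypothetical value, and `lr_at` is an illustrative helper rather than the training code that was used.

```python
import math

BASE_LR = 4e-05      # learning_rate from the list above
WARMUP_RATIO = 0.03  # lr_scheduler_warmup_ratio from the list above


def lr_at(step, total_steps, base_lr=BASE_LR, warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given optimizer step (assumed schedule shape)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr
        return base_lr * step / max(1, warmup_steps)
    # Half-cosine decay from base_lr down to 0 over the remaining steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


TOTAL_STEPS = 1000  # hypothetical; not stated in the card
for step in (0, 15, 30, 515, 1000):
    print(step, lr_at(step, TOTAL_STEPS))
```

Under these assumptions the rate peaks at the end of warmup (about 3% of the way through training) and decays smoothly to zero by the final step.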

Model size: 406M params (Safetensors, tensor type F32)