bart-base-instruct: dolly_hhrlhf


This model is a fine-tuned version of facebook/bart-base on the pszemraj/dolly_hhrlhf-text2text dataset.

Model description

This is a text2text (instruction-following) model, fine-tuned on a modified version of the relatively permissive mosaicml/dolly_hhrlhf dataset.
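If you want to inspect the training data yourself, here is a minimal sketch using the datasets library; the column names are not assumed here, the snippet simply prints them:

# pip install -q datasets
from datasets import load_dataset

# load the text2text-formatted variant used for fine-tuning
ds = load_dataset("pszemraj/dolly_hhrlhf-text2text")

# inspect the splits, columns, and one example
print(ds)
print(ds["train"].column_names)
print(ds["train"][0])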

Basic usage in Python:

# pip install -q transformers accelerate
from transformers import pipeline, GenerationConfig

model_name = "pszemraj/bart-base-instruct-dolly_hhrlhf"

# load the model, placing it on the best available device(s)
assistant = pipeline(
    "text2text-generation",
    model=model_name,
    device_map="auto",
)

# load the generation settings saved alongside the model
cfg = GenerationConfig.from_pretrained(model_name)

# pass an 'instruction' as the prompt to the pipeline
prompt = "Write a guide on how to become a ninja while working a 9-5 job."
result = assistant(prompt, generation_config=cfg)[0]["generated_text"]
print(result)

Using the saved generation config is optional; you can substitute other generation parameters instead.
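For example, a minimal sketch that passes explicit generation parameters directly to the pipeline call instead of the saved config (the specific values here are illustrative, not tuned):

# beam search with explicit, illustrative settings
result = assistant(
    prompt,
    max_new_tokens=256,
    num_beams=4,
    early_stopping=True,
    no_repeat_ngram_size=3,
)[0]["generated_text"]
print(result)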

Intended uses & limitations

  • this model is not tuned with RLHF or similar alignment methods, and may produce offensive output
  • this model is small (139M parameters, roughly 600 MB at fp32), so its "cognition" abilities are limited

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 4e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 3.0
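As a minimal sketch, these settings map onto transformers training arguments roughly as follows; the output_dir is a hypothetical placeholder, and any argument not listed above is an assumption, not taken from the original run:

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./bart-base-instruct-dolly_hhrlhf",  # hypothetical path
    learning_rate=4e-5,
    per_device_train_batch_size=8,  # per-device batch 8 x grad accum 8 -> total 64
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=3.0,
)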
