Model Card for Smart-Tab-Topic

This generative model creates a concise 1-3 word label for a web browser's tab group, based on the titles of one or more pages.

It is used by Mozilla's AI tab grouping feature.

Model Details

The model was created by fine-tuning a flan-t5-base model on a custom dataset, then distilling that model into a google/t5-efficient-tiny model.

Model Description

  • Developed by: Mozilla
  • Language(s) (NLP): English
  • Finetuned from model: google/flan-t5-base, google/t5-efficient-tiny

Model Sources

Training code is available at https://github.com/mozilla/smart-tab-grouping, though the training data is not available.

Uses

Direct Use

The model expects input in the following strict format.

Topic from keywords: [up to 3 comma-separated, lower-case keywords]. titles: \n [up to 3 titles separated by \n]

Keywords are optional and should not be included for single-tab use cases. For multi-tab use cases, the keywords should be representative of all tab titles and/or descriptions in the suggested tab group. In our implementation we apply the c-TF-IDF algorithm to select keywords from up to 10 documents.
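
The exact keyword-selection code lives in the training repository; as a rough, self-contained illustration, a c-TF-IDF scorer in the common BERTopic-style formulation might look like the sketch below. The function name `ctfidf_keywords` and the regex tokenization are illustrative assumptions, not Mozilla's actual implementation.

```python
import math
import re
from collections import Counter

def ctfidf_keywords(groups, top_k=3):
    """groups: list of tab groups, each a list of page titles.
    Returns the top_k highest-scoring keywords for each group."""
    # Bag-of-words per group (each group is a "class" in class-based TF-IDF).
    tokenized = [
        Counter(w for title in group for w in re.findall(r"[a-z]+", title.lower()))
        for group in groups
    ]
    total_freq = sum(tokenized, Counter())  # term frequency across all groups
    avg_words = sum(total_freq.values()) / len(groups)
    keywords = []
    for tf in tokenized:
        n = sum(tf.values()) or 1
        scores = {
            term: (count / n) * math.log(1 + avg_words / total_freq[term])
            for term, count in tf.items()
        }
        keywords.append(sorted(scores, key=scores.get, reverse=True)[:top_k])
    return keywords
```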

Example inputs:

Topic from keywords: . titles: \n Dogs - Google Search
Topic from keywords: dogs,food,pets. titles: \nDogs - Google Search\nDog Food - Shopping\nHow can I buy a pet - Google Search
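
For illustration, here is a minimal sketch of building the prompt and generating a label with Hugging Face transformers. The repository ID `Mozilla/smart-tab-topic` is an assumption and should be replaced with the actual model path.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_ID = "Mozilla/smart-tab-topic"  # assumed repository ID; replace as needed
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

def label_tab_group(titles, keywords=()):
    """Build the expected prompt and generate a short topic label."""
    prompt = "Topic from keywords: " + ",".join(keywords) + ". titles: \n" + "\n".join(titles)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=10)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(label_tab_group(
    ["Dogs - Google Search", "Dog Food - Shopping", "How can I buy a pet - Google Search"],
    keywords=["dogs", "food", "pets"],
))
```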

Bias, Risks, and Limitations

The model has undergone basic testing to ensure it does not produce undesirable responses.

It filters some swear words. In some instances it may output 'None' when uncertain, or 'Adult Content' for inappropriate content. Systems that use the model should filter out those two responses.
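
A consuming application might drop those sentinel outputs with a check along these lines (a sketch; the sentinel strings are taken from this card and should be confirmed against the deployed model):

```python
SENTINEL_LABELS = {"None", "Adult Content"}  # strings named in this card

def usable_label(label: str) -> str | None:
    """Return the label, or None if it is a sentinel that should be dropped."""
    label = label.strip()
    return None if label in SENTINEL_LABELS else label
```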

Training Details

Training data was created by using OpenAI models to generate archetypes of 50 fictional users and their imagined browsing activity for various tasks.

Page titles from those synthetic pages were clustered and then labeled using OpenAI, with an n-shot approach that included hand-labeled examples in each query.
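
As a sketch of that labeling step, assuming the current OpenAI chat completions API; the prompt text, few-shot pairs, and model choice here are illustrative, not Mozilla's actual setup:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hand-labeled examples; these specific pairs are made up for illustration.
FEW_SHOT = [
    ("Dogs - Google Search\nDog Food - Shopping", "Pets"),
    ("Flights to Tokyo - Kayak\nTokyo hotels - Booking.com", "Travel"),
]

def label_cluster(titles):
    messages = [{
        "role": "system",
        "content": "Label this cluster of browsing page titles with a concise 1-3 word topic.",
    }]
    for example_titles, label in FEW_SHOT:
        messages.append({"role": "user", "content": example_titles})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": "\n".join(titles)})
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return response.choices[0].message.content.strip()
```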

The training data was augmented with thousands of English page titles extracted from the Common Crawl dataset.

An additional pre-processing step applied during training removes less important words from the training topic labels, shortening most topics to a single word.
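
The card does not specify the shortening heuristic; one plausible sketch is dropping common filler words and keeping the leading content word. The word list and function name here are purely illustrative:

```python
# Illustrative stopword list; the real pipeline's word list is not published.
LESS_IMPORTANT = {"the", "a", "an", "and", "or", "of", "for", "to", "in", "on", "with"}

def shorten_label(label: str, max_words: int = 1) -> str:
    """Drop filler words, then keep at most max_words of what remains."""
    content = [w for w in label.split() if w.lower() not in LESS_IMPORTANT]
    return " ".join((content or label.split())[:max_words])

print(shorten_label("Shopping for Dog Food"))  # -> "Shopping"
```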

The training data was used to fine-tune a flan-t5-base model, which was later distilled into a t5-efficient-tiny model.
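
The distillation recipe is in the training repository; as a generic illustration only, standard logit distillation for a seq2seq student blends a softened KL term against the teacher with ordinary cross-entropy on the labels (the temperature and mixing weight below are placeholders):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend a softened KL term (teacher -> student) with cross-entropy on labels."""
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale gradients for the temperature
    ce = F.cross_entropy(
        student_logits.reshape(-1, student_logits.size(-1)),
        labels.reshape(-1),
    )
    return alpha * kd + (1 - alpha) * ce
```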

The model was then quantized to q8 (8-bit precision) for production use in Firefox.
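
Eight-bit quantization of an ONNX export can be done with onnxruntime's dynamic quantizer; a minimal sketch with placeholder file paths follows (Mozilla's actual pipeline may differ):

```python
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    model_input="smart-tab-topic.onnx",      # placeholder path to the fp32 export
    model_output="smart-tab-topic.q8.onnx",  # placeholder output path
    weight_type=QuantType.QInt8,             # 8-bit weights
)
```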

Compute Infrastructure

The model was tuned on an NVIDIA H100 GPU.
