Model Card for Smart-Tab-Topic
This generative model creates a concise 1-3 word label for a web browser's tab group, based on the titles of one or more pages.
It is used by Mozilla's AI tab grouping feature.
Model Details
The model was created by fine-tuning a google/flan-t5-base model on a dataset, then distilling that model into a google/t5-efficient-tiny model.
Model Description
- Developed by: Mozilla
- Language(s) (NLP): English
- Finetuned from model: google/flan-t5-base, google/t5-efficient-tiny
Model Sources
Training code is available at https://github.com/mozilla/smart-tab-grouping; the training data is not available.
- T5 paper: https://arxiv.org/abs/1910.10683
- How to use AI-enhanced tab groups: https://support.mozilla.org/en-US/kb/how-use-ai-enhanced-tab-groups
Uses
Direct Use
The model requires input in the following strict format:
Topic from keywords: [up to 3 comma-separated lowercase keywords]. titles: \n [up to 3 \n-separated titles]
Keywords are optional and should not be included for single-tab use cases. For multi-tab use cases, the keywords should be representative of all tab titles and/or descriptions in the suggested tab group. In our implementation we apply the c-TF-IDF algorithm to select keywords from up to 10 documents, as sketched below.
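A minimal sketch of this keyword selection, assuming the standard c-TF-IDF formulation; the function name and scoring details below are illustrative, not Mozilla's exact code:

```python
# Minimal c-TF-IDF keyword selection sketch. This is NOT the exact
# implementation used in Firefox; names and scoring details are illustrative.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

def top_keywords(group_titles, all_titles, k=3):
    """Pick up to k representative lowercase keywords for one tab group."""
    vectorizer = CountVectorizer(lowercase=True, stop_words="english")
    counts = vectorizer.fit_transform(all_titles)   # per-document term counts
    vocab = vectorizer.get_feature_names_out()

    # Treat the whole group as one "class document" (the "c" in c-TF-IDF).
    tf = np.asarray(vectorizer.transform(group_titles).sum(axis=0)).ravel()

    # Class-based IDF: average words per document over each term's total count.
    # In practice all_titles spans many groups, so common terms get downweighted.
    total_freq = np.asarray(counts.sum(axis=0)).ravel()
    avg_words = counts.sum() / counts.shape[0]
    idf = np.log(1 + avg_words / total_freq)

    scores = tf * idf
    top = np.argsort(scores)[::-1][:k]
    return [vocab[i] for i in top if scores[i] > 0]

titles = ["Dogs - Google Search", "Dog Food - Shopping",
          "How can I buy a pet - Google Search"]
print(top_keywords(titles, titles))  # prints up to 3 representative keywords
```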
Example inputs:
Topic from keywords: . titles: \n Dogs - Google Search
Topic from keywords: dogs,food,pets. titles: \nDogs - Google Search\nDog Food - Shopping\nHow can I buy a pet - Google Search
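For reference, a minimal inference sketch using Hugging Face transformers; the repository id below is an assumption, so substitute the model's actual id:

```python
# Minimal sketch of running the model with Hugging Face transformers.
# "Mozilla/smart-tab-topic" is an assumed repository id.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "Mozilla/smart-tab-topic"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Multi-tab case: keywords plus up to three newline-separated titles.
prompt = (
    "Topic from keywords: dogs,food,pets. titles: \n"
    "Dogs - Google Search\n"
    "Dog Food - Shopping\n"
    "How can I buy a pet - Google Search"
)

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```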
Bias, Risks, and Limitations
The model has undergone basic testing to check that it does not output undesirable responses.
It filters some swear words. In some instances it may output 'None' when uncertain, or 'Adult Content' for inappropriate content. Systems that use the model should filter out those two responses.
Training Details
Training data was created by using OpenAI models to generate archetypes of 50 synthetic users and their imagined browsing activity for various tasks.
Page titles from those synthetic pages were clustered and then labeled with OpenAI, using a few-shot approach with hand-labeled examples in each query.
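The labeling prompts themselves are not published; the sketch below only illustrates the general few-shot pattern, and the example labels, prompt wording, and model choice are all assumptions:

```python
# Illustrative few-shot labeling query; not the actual prompt used.
from openai import OpenAI

client = OpenAI()

FEW_SHOT = (
    "Titles: Dogs - Google Search; Dog Food - Shopping\n"
    "Topic: Pets\n\n"
    "Titles: Cheap flights to Rome; Rome hotels - Booking\n"
    "Topic: Rome Trip\n\n"
)

def label_cluster(titles: list[str]) -> str:
    """Ask the LLM for a short topic label for one cluster of page titles."""
    prompt = FEW_SHOT + "Titles: " + "; ".join(titles) + "\nTopic:"
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
        max_tokens=8,
    )
    return response.choices[0].message.content.strip()
```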
The training data was augmented with thousands of English page titles extracted from the Common Crawl dataset.
An additional pre-processing step applied during training removes less important words from the training topic labels, shortening most topics to a single word.
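The exact word-importance heuristic is not published; as a loose sketch, assuming a simple generic-word filter (the word list and function name are invented for illustration):

```python
# Loose sketch of label shortening; the real step likely uses a learned or
# corpus-based importance score rather than this hand-written word list.
GENERIC_WORDS = {"the", "a", "an", "and", "of", "to", "for",
                 "guide", "tips", "best"}

def shorten_label(label: str) -> str:
    """Drop less important words; most labels collapse to a single word."""
    kept = [w for w in label.split() if w.lower() not in GENERIC_WORDS]
    return " ".join(kept) if kept else label

print(shorten_label("Tips for Dogs"))  # -> "Dogs"
```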
The training data was used to fine-tune a flan-t5-base model, which was then distilled into a t5-efficient-tiny model.
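A condensed sketch of one way to do this distillation, assuming sequence-level (hard-label) distillation with a shared tokenizer; the paths, hyperparameters, and single-example loop are illustrative, and the real recipe may also match teacher logits:

```python
# Sequence-level distillation sketch: the student learns to reproduce the
# fine-tuned teacher's generated labels. Paths and settings are illustrative.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tok = AutoTokenizer.from_pretrained("google/flan-t5-base")
teacher = AutoModelForSeq2SeqLM.from_pretrained("./finetuned-flan-t5-base")  # assumed local checkpoint
student = AutoModelForSeq2SeqLM.from_pretrained("google/t5-efficient-tiny")
optimizer = torch.optim.AdamW(student.parameters(), lr=3e-4)

prompts = [  # stand-in for the real training inputs
    "Topic from keywords: dogs,food,pets. titles: \nDogs - Google Search",
]

for prompt in prompts:
    enc = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        generated = teacher.generate(**enc, max_new_tokens=8)
    labels = generated[:, 1:].clone()            # drop the decoder-start token
    labels[labels == tok.pad_token_id] = -100    # ignore padding in the loss
    loss = student(input_ids=enc.input_ids,
                   attention_mask=enc.attention_mask,
                   labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```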
The model was then quantized to q8 (8-bit precision) for production use in Firefox.
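One plausible route to q8, assuming the model is first exported to ONNX; the file names below are illustrative, and this may not be the exact tool Mozilla used:

```python
# Dynamic 8-bit quantization of an ONNX export with onnxruntime.
# File names are illustrative; the actual export layout may differ.
from onnxruntime.quantization import quantize_dynamic, QuantType

quantize_dynamic(
    model_input="onnx/model.onnx",              # fp32 export (assumed path)
    model_output="onnx/model_quantized.onnx",   # q8 output
    weight_type=QuantType.QInt8,                # 8-bit integer weights
)
```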
Compute Infrastructure
Tuned on an H100 GPU