Minibase releases lightweight text-to-text detoxification models.
TL;DR: We’re releasing two small text-to-text models that detoxify language with strong performance. Both models are available on HuggingFace today (Small & Standard), or can be accessed directly on Minibase.ai (links to Small & Standard models) for further fine‑tuning or API calls.
The internet is a toxic place. People say mean things all the time, and their words can be hurtful :(. Companies dealing with this problem can’t just delete comments, though, because sometimes the underlying sentiments are worth keeping. The real challenge is removing mean language without stripping away the meaning.
Detoxifier models are already used in lots of places. Twitch uses them to moderate live chats. Discord bots scan text in real time and rewrite or suppress toxic language. YouTube flags offensive comments, while Reddit uses the Perspective API to catch toxicity in specific subreddits.
Most existing detoxifiers, though, either go too far, rewriting the sentence into something bland, or not far enough, letting hateful text slip through. They also tend to be quite large, which means they can’t be deployed locally with very low latency. So we trained two small detoxification models that are extremely fast and can run locally, even from your browser.
Both models were trained in a couple of hours on the Minibase platform, which is currently free to use while in beta.
There are three key metrics for ranking detoxification models: how much toxicity the model removes, how well it preserves the original meaning of a text, and whether the rewritten text still sounds natural. Using the ParaDetox dataset as a benchmark, our Detoxify‑Medium cut toxicity by about 91% while preserving more than half of the original meaning and scoring 93% on fluency. Detoxify‑Small cut toxicity by about half, but runs with latency under 70 milliseconds and is only about 140 MB in size.
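If you want to sanity-check a detoxifier on your own text, here is a rough sketch of how the first two metrics are commonly computed: toxicity reduction from an off-the-shelf toxicity classifier and meaning preservation from sentence-embedding similarity (fluency is usually scored with an acceptability or language model, omitted here for brevity). This is not our exact evaluation pipeline, and the `detoxify` and `sentence-transformers` packages are just one reasonable choice:

```python
# Rough sketch of ParaDetox-style metrics (not Minibase's exact eval recipe):
# toxicity reduction via a toxicity classifier, meaning preservation via
# cosine similarity of sentence embeddings.
from detoxify import Detoxify                      # off-the-shelf toxicity classifier
from sentence_transformers import SentenceTransformer, util

tox_model = Detoxify("original")
sim_model = SentenceTransformer("all-MiniLM-L6-v2")

def evaluate_pair(toxic_text: str, detoxified_text: str) -> dict:
    # Toxicity reduction: drop in classifier score from input to output.
    tox_before = tox_model.predict(toxic_text)["toxicity"]
    tox_after = tox_model.predict(detoxified_text)["toxicity"]

    # Meaning preservation: cosine similarity between sentence embeddings.
    emb = sim_model.encode([toxic_text, detoxified_text], convert_to_tensor=True)
    similarity = util.cos_sim(emb[0], emb[1]).item()

    return {
        "toxicity_before": tox_before,
        "toxicity_after": tox_after,
        "toxicity_reduction": tox_before - tox_after,
        "meaning_preservation": similarity,
    }

print(evaluate_pair("You stupid idiot, get out of my way!",
                    "You silly person, please move aside!"))
```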
Both models successfully rewrite text:
“This is fucking awesome!” —> “This is really awesome!”
“You stupid idiot, get out of my way!” —> “You silly person, please move aside!”
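If you’d like to try this yourself and the checkpoints ship in the standard HuggingFace seq2seq format, they can be called through the transformers pipeline. The repo id and generation settings below are placeholders rather than the actual model names, so swap in the real checkpoint from the links above:

```python
# Minimal usage sketch, assuming a standard HuggingFace seq2seq checkpoint.
# "minibase-ai/detoxify-small" is a placeholder repo id, not the real one.
from transformers import pipeline

detoxifier = pipeline(
    "text2text-generation",
    model="minibase-ai/detoxify-small",  # placeholder: use the actual HF repo id
)

toxic = "You stupid idiot, get out of my way!"
clean = detoxifier(toxic, max_new_tokens=64)[0]["generated_text"]
print(clean)  # e.g. "You silly person, please move aside!"
```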
These results hold up well compared to other models on HuggingFace, too. BART‑based detox models are strong on meaning preservation, but they come in at half a gigabyte and don’t run easily on small machines. Multilingual models like mBART or mT0‑XL do better in some languages, but they are several gigabytes and slow.

Both Minibase models are released under the Apache 2.0 license.
If you have any questions, come join us on the Minibase Discord.