Introducing: 🤏🏻🏭SmolFactory

Community Article · Published August 10, 2025

SmolFactory is an end-to-end model maker and deployer on Hugging Face
  • 🤏🏻🏭SmolFactory helps you make SmolLM3 or gpt-oss models easily, cheaply, and quickly
  • It allows you to deploy SFT or DPO fine-tuning experiments
  • automatically uploads your model
  • automatically deploys live tracking of your model-making experiment
  • automatically deploys an application where you can test your model
  • helps you load your data easily
  • takes care of experiment parameters
  • allows you to customize your training (for advanced users)
  • deploys an API and MCP server with your model
  • You can use all of this immediately

What Is SmolFactory?

🤏🏻🏭SmolFactory is a one-click way to make finetuned models, track your finetune, publish your model, and test it immediately.

What It's For

  • easily and cheaply finetune open source models
  • quickly train and deploy a model
  • track the process on the web & mobile

Who It's For

  • learners
  • businesses
  • open-source advocates

Businesses

The value promised by AI/ML really hasn't been realized by businesses, and yet with a minuscule budget and a positive attitude, it's truly possible for enterprises to produce extremely valuable intellectual property that adds revenue and company value almost immediately.

Developers

What I've noticed is that, unless they're spoon-fed training recipes, fewer than 0.01% of developers are capable of training a model.

Check Out Some Results

Use the interactive demo above to see a SmolLM3 model trained on `legmlai`'s hermes-fr data for French understanding.

The Inspiration

What I care about personally:

  • maybe I'm a boring older guy now, but I actually care about:
    • corporate law
    • business law
    • business understanding
    • the French language
    • reasoning abilities in the topics above

but you can use 🤏🏻🏭SmolFactory for whatever you like

SmolLM3

Have you actually checked out SmolLM3 by Elie Bakoush and the amazing SmolLM3 team? (Elie Bakoush, Carlos Miguel Patiño, Anton Lozhkov, Edward Beeching, Aymeric Roucher, Nouamane Tazi, Aksel Joonas Reedi, Guilherme Penedo, Hynek Kydlicek, Clémentine Fourrier, Nathan Habib, Kashif Rasul, Quentin Gallouédec, Hugo Larcher, Mathieu Morlon, Joshua, Vaibhav Srivastav, Xuan-Son Nguyen, Colin Raffel, Lewis Tunstall, Loubna Ben Allal, Leandro von Werra, Thomas Wolf)

I'm asking because it's such a cool model! One of the most impressive capabilities of SmolLM3 is its super-long-context understanding. You can literally dump a book (or three!) into its context window and it can reason about questions that other (much larger and more expensive) models can't. It's also verifiably open source: it's observably trained, end-to-end, on completely licensed data, meaning it's commercially viable in the sense that both its licence and its data are actually open source! Of course, it's also a reasoning model, and its reasoning format is XML, which makes it super easy to produce reasoning traces for other purposes and very cost-effective for structured data generation - like the JSON folks need and use for agents! It's also a really capable multilingual model that does math. So what's not to like? And yet:


hardly anyone has actually worked with SmolLM3!

The thing fits on a small machine, and by the way, the Hugging Face TB team has released the checkpoints so you can also retrain it - and nobody's really working with it! Pure insanity; it's really one of the best models in the world, in my opinion. SmolLM3 proves that smart engineering and open collaboration can deliver compact, capable, and versatile models that are efficient, innovative, and ready to scale. If you want to see something cool, check out PyTorch's quant that literally fits on a phone using ExecuTorch! By the way, this means you could fine-tune SmolLM3 to execute Android commands and run your phone using this dataset!
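
If you just want to poke at it before fine-tuning anything, a minimal generation sketch with transformers looks like this (it assumes the public `HuggingFaceTB/SmolLM3-3B` checkpoint on the Hub; the prompt is just an example):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM3-3B"  # public SmolLM3 checkpoint on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# SmolLM3 is a chat/reasoning model, so go through the chat template
messages = [{"role": "user", "content": "Explain, step by step, what a French SARL is."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))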

GPT-OSS

OpenAI's GPT-OSS models are their latest - not their only! - open-source models. They were published last week to a "mixed reception", but I personally think they're really fantastic. First off, they're open: released under the Apache-2.0 licence, so you can fine-tune, modify, and distribute them commercially. Second, they're really smart reasoning models on topics like coding, math, and science, and paired with multilingual ability, that means they can stay smart on other topics too - business, law, and multilingual understanding, for example. The 120B version can run on a single 80 GB GPU (consumer-ish grade, if you're a consumer who spends $12K on hardware for compute); the 20B fits in 16 GB of VRAM (basically a gaming laptop). They have a long context of 128k tokens, and further quantization can make them quite small and usable. Transparent weights, hackable, and "own-your-stack" deployment that pairs beautifully with [SmolFactory's workflows](#how-it-works).
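
If you want to try the 20B locally before fine-tuning it, a minimal sketch with transformers looks like this (it assumes the public `openai/gpt-oss-20b` checkpoint, a recent transformers version, and roughly 16 GB of VRAM):

from transformers import pipeline

# openai/gpt-oss-20b is the public 20B checkpoint on the Hub
pipe = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",
    torch_dtype="auto",  # keep the released weights' dtype where supported
    device_map="auto",
)

messages = [{"role": "user", "content": "Draft a one-paragraph mutual NDA clause."}]
result = pipe(messages, max_new_tokens=200)
print(result[0]["generated_text"][-1]["content"])  # last message is the assistant reply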

France

Here in France, we have a ton of talent. For example, Meta's FAIR has an office here producing a ton of research. France is also, literally, the birthplace of Hugging Face. You can't really talk about AI in France without at least mentioning Mistral. We also host outstanding contributors like Maziyar Panahi. The government here is on Hugging Face, literally pushing open-source datasets, and computing is a public good, which is super cool. And yet, nothing much is happening. Government-sponsored sovereign AI initiatives are a disaster, mostly built by web devs from Russia, and toxic, misogynist, and colonialist. Meanwhile, the government has stopped maintaining its own leaderboard for French understanding, probably because the models it sponsored all sit at the bottom of the list - when they place at all. Meanwhile, community models and initiatives continue to outpace any government-sponsored project by leaps and bounds. Bright lights include Legml.ai's clean dataset and French-speaking model, the open-source French-speaking leaderboard, and the business understanding leaderboard.

What We're Trying To Do Here

  • I was trying to:
    • get my name and model on the leaderboards
    • finetune a model
  • What this does:
    • helps you train a model
    • publishes your results on Hugging Face

🤏🏻🏭SmolFactory

There are two ways to use 🤏🏻🏭SmolFactory: on Hugging Face, or on your own cloud provider.

Get Started

🤏🏻🏭SmolFactory works on any cloud provider, but we'll show you how to use it with Hugging Face.

Get your Hugging Face tokens

  • Go to your Hugging Face settings and click on "Create token"
  • Make a read token
  • Make a write token
  • Write these down somewhere, for example in Notepad on Windows

Duplicate the Interface

Visit https://huggingface.co/spaces/Tonic/SmolFactory

You'll find the SmolFactory interface: an end-to-end model maker and deployer on Hugging Face.

The first thing you'll notice is the warning at the top of the page.

Click on "Duplicate this Space" and you'll see a token prompt:

Duplicating the Space requires you to add your tokens.

Add your read and write tokens in the fields provided.

Select A GPU

Before you click Deploy, you should select a GPU.

Select your GPU from the list provided: L4s and A10Gs work great for SmolLM3!

Click Deploy This Space

Once you've deployed the Space with a GPU, the warning will disappear and be replaced by a validation message.

Use the Interface

Now you can use the interface to train your model, deploy your training monitoring, and upload and deploy your model!

Configure your run in simple steps:

Click on SmolLM3 to Get Started

  • Click on SmolLM3 to get started
    • You might have to select gpt-oss at first to reveal the interface
    • Just select SmolLM3 again if that's the case

Use Defaults

  • The interface provides you with customizable defaults
  • Add your name in the provided field for your published model card

Click Run Pipeline

  • And that's really about it - you should see the training commands get executed in the logs below

[add gif, or video]

Advanced Configuration

There's also a simple interface for advanced configuration of experiments where you can choose a few parameters. It isn't complicated, but unless you really care, we won't get into it here.

What it Does

Once you've clicked Run Pipeline, five things happen:

SmolFactory Starts Your Training

"

The training will start

SmolFactory Tracks Your Training

The training will be tracked in a Hugging Face dataset
  • a dataset with all your training parameters and data will be created
  • it will be updated live

SmolFactory Monitors Your Training with a Custom Monitor

The training will be monitored in a Space
  • A custom/personal experiment tracker tracks your training
  • To find it, navigate to your profile (don't close the SmolFactory window)
  • Click on the latest experiment (the first one in the list on your dashboard)
  • You can use this on mobile to track your training on the go

SmolFactory Uploads Your Model (after the training is completed)

Your model is automatically uploaded.

SmolFactory Deploys A Demo Space

Your model is demoed on a Hugging Face Space
  • Use the interactive demo above
  • SmolFactory automatically deploys a demo Space
  • You need to click on Settings in the top right-hand side of your Space
  • Then select ZeroGPU - this requires a Pro account
  • That's it!

Congratulations

  • You just:
    • Published a model
    • Tracked your training
    • Deployed a demo on Hugging Face!

Congratulations!

Now What?

Well, now what?

That's up to you - the world is your oyster!

  • Promote your model by writing a post
  • Use your model via API or MCP
  • Share your model with the world!

Your model is now also available via API and MCP!

  • Scroll to the bottom of your demo and click on "Use via API or MCP"
  • You can already use your model in your app or your agents - see the sketch below!
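
For example, calling your demo Space programmatically with `gradio_client` might look like this (the Space id and `api_name` below are hypothetical; your demo's "Use via API" panel shows the real ones):

from gradio_client import Client

# Hypothetical Space id: replace with the demo Space SmolFactory deployed for you
client = Client("your-username/your-model-demo")

# The endpoint name ("/chat" here) is listed in the Space's "Use via API" panel
result = client.predict("Bonjour ! Peux-tu m'aider ?", api_name="/chat")
print(result)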

How it works


Entry Point

  • Role: Orchestrates the run. Initializes configuration, model, data, and training logic.
  • Starts: Validation, environment setup, and the end-to-end workflow.

Configuration Management

graph LR
    Configuration_Management["Configuration Management"]
    Training_Orchestration["Training Orchestration"]
    Training_Orchestration -- "retrieves configuration from" --> Configuration_Management
    click Configuration_Management href "https://github.com//Josephrp/SmolFactory/blob/main/SmolFactory/docs/blob/Configuration_Management.md" "Details"
  • Purpose: Central source of truth for hyperparameters, paths, and runtime options.
  • Guarantee: Validated, reproducible experiments via structured settings.
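
As a rough illustration (field names here are invented for the sketch, not SmolFactory's actual schema), a structured, validated config might look like:

from dataclasses import dataclass

# Illustrative only: SmolFactory's real configuration classes live in its repo.
@dataclass
class TrainConfig:
    model_name: str = "HuggingFaceTB/SmolLM3-3B"
    dataset_repo: str = "your-username/your-sft-data"  # hypothetical dataset id
    learning_rate: float = 2e-5
    num_epochs: int = 1
    output_dir: str = "./output"

    def validate(self) -> None:
        # Fail fast so every run is reproducible from its settings
        assert self.learning_rate > 0, "learning rate must be positive"
        assert self.num_epochs >= 1, "need at least one epoch"

config = TrainConfig()
config.validate()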

Model Abstraction

graph LR
    EntryPoint["EntryPoint"]
    Model_Abstraction["Model Abstraction"]
    EntryPoint -- "initiates model loading in" --> Model_Abstraction
    click Model_Abstraction href "https://github.com/Josephrp/SmolFactory/blob/main/docs/Model_Abstraction.md" "Details"
  • What it does: Loads and prepares models (e.g., quantization, adapters) and the tokenizer.
  • Why: Keeps training logic independent from specific architectures.
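
A minimal sketch of what such a layer does, using stock transformers/peft calls rather than SmolFactory's actual code:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

def load_model_and_tokenizer(model_name: str):
    # Optional 4-bit quantization so the model fits on small GPUs
    bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, quantization_config=bnb, device_map="auto"
    )
    # Attach a LoRA adapter so training only updates a small set of weights
    model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))
    return model, tokenizer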

Data Pipeline

image/png

  • Scope: Dataset loading, preprocessing/tokenization, splitting, and performant dataloaders.
  • Output: Clean, batched data streams for training/eval.
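
A sketch of that flow with datasets/transformers (the dataset id and the `text` column are assumptions; your data may be chat-formatted instead):

from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM3-3B")

def tokenize(batch):
    # Assumes a plain "text" column; chat data would go through the chat template first
    return tokenizer(batch["text"], truncation=True, max_length=2048)

raw = load_dataset("your-username/your-sft-data", split="train")  # hypothetical id
splits = raw.train_test_split(test_size=0.05)
train_ds = splits["train"].map(tokenize, batched=True, remove_columns=raw.column_names)
eval_ds = splits["test"].map(tokenize, batched=True, remove_columns=raw.column_names)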

Training Orchestrator

image/png

  • Loop: Forward/backward passes, optimization, checkpointing, evaluation, callbacks.
  • Integration: Works with acceleration libraries and logging/metrics collectors.

Real-time tracking and dataset flow

  • Live monitor: A lightweight monitor collects metrics during training.
  • Dual storage: Metrics flow to a Trackio Space API and a Hugging Face Dataset for history.
  • Visualization: Web UI shows real-time curves and experiment parameters.
  • Persistence: Final artifacts and logs remain accessible after runs.

What gets produced

  • Local outputs: Checkpoints, configs, training summaries, logs.
  • HF Dataset: Structured experiment records and time series of metrics.
  • Model repo: Final model weights with a model card and metadata.
  • Demo Space: Optional interactive app to try the model immediately.

End-to-end sequence

Step-by-step
  • A. User starts pipeline: Entry point validates settings and environment.
  • B. Auth & config: Tokens/env vars loaded; run is parameterized.
  • C. Dataset repo setup: Creates/uses a dataset repo for experiment logging.
  • D. Trackio deployment: Prepares a monitoring space endpoint for live updates.
  • E. Training execution: Model trains; callbacks emit metrics and checkpoints.
  • F. Real-time data flow: Metrics stream to Space + Dataset for visualization and history.
  • G. Model push & cleanup: Upload final model; summarize outputs.

Track-Tonic Monitoring, logging, and callbacks

Overview

  • SmolLM3Monitor: Central object for experiment tracking with dual targets: Trackio Space and Hugging Face Datasets. Supports modes: both, dataset, trackio, none.
  • Trainer callback: A custom transformers.TrainerCallback streams logs, checkpoints, eval results, and light system metrics.
  • TRL compatibility layer: A trackio module with init/log/finish mirrors TRL’s logging interface for drop-in usage.

See: SmolFactory

References: Transformers Trainer, Transformers Callbacks, TRL Logging.

Monitoring modes and configuration

  • Env/args:
    • MONITORING_MODE: both | dataset | trackio | none
    • HF_TOKEN: required for dataset writes and Trackio Space auth
    • TRACKIO_URL or TRACKIO_SPACE_ID: select the Trackio Space
    • TRACKIO_DATASET_REPO: HF dataset repo (default tonic/trackio-experiments)
    • TRACKIO_FLUSH_INTERVAL: metrics batching frequency (default 10)
  • Constructor: SmolLM3Monitor(experiment_name, trackio_url, hf_token, dataset_repo, monitoring_mode, ...) resolves env fallbacks.
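
Concretely, you can drive everything from the environment and let the constructor pick up the fallbacks (a sketch; keep real secrets outside your source code):

import os
from monitoring import SmolLM3Monitor  # import path assumed; check your repo copy

# Example values for the knobs listed above
os.environ.setdefault("MONITORING_MODE", "both")
os.environ.setdefault("TRACKIO_DATASET_REPO", "tonic/trackio-experiments")
os.environ.setdefault("TRACKIO_FLUSH_INTERVAL", "10")
# HF_TOKEN and TRACKIO_URL should come from your secrets store, not source code

monitor = SmolLM3Monitor(experiment_name="smollm3_run")  # env fallbacks resolved here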

Dataset persistence mechanics

  • Merges by experiment_id, de‑dups metrics by (step, timestamp), and normalizes to nested entries: {timestamp, step, metrics: {...}}.
  • Parameters are dict-merged; artifacts/logs are de‑duplicated while preserving order.
  • Saves on a cadence via TRACKIO_FLUSH_INTERVAL and on close; final status and counts are recorded.
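
In pseudocode terms, the merge behaves roughly like this (a sketch of the rules above, not the actual implementation):

def merge_experiment(existing: dict, incoming: dict) -> dict:
    # De-duplicate metrics by (step, timestamp), preserving order
    seen = {(m["step"], m["timestamp"]) for m in existing.get("metrics", [])}
    for m in incoming.get("metrics", []):
        if (m["step"], m["timestamp"]) not in seen:
            existing.setdefault("metrics", []).append(m)
            seen.add((m["step"], m["timestamp"]))
    # Parameters are dict-merged; newest values win
    existing.setdefault("parameters", {}).update(incoming.get("parameters", {}))
    # Artifacts/logs are de-duplicated while preserving order
    for key in ("artifacts", "logs"):
        merged = list(existing.get(key, []))
        merged += [x for x in incoming.get(key, []) if x not in merged]
        existing[key] = merged
    return existing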

Trackio Space integration

  • Validates space connectivity at init; gracefully degrades to dataset‑only if unavailable.
  • Logs configuration, metrics, checkpoints, and summary via the TrackioAPIClient.

Transformers Trainer callback integration

  • The monitor exposes create_monitoring_callback() which returns a transformers.TrainerCallback implementing:
    • on_log: enriches logs with step time, throughput (if tokens available), token stats, optional token accuracy, then emits to backends
    • on_save: logs checkpoints when saved
    • on_evaluate: records eval metrics per step
    • on_train_begin / on_train_end: lifecycle markers and final flush

Example usage with Trainer:

import os

from transformers import Trainer, TrainingArguments
# SmolLM3Monitor ships with SmolFactory; adjust this import to wherever the
# monitor module lives in your copy of the repo (path assumed here).
from monitoring import SmolLM3Monitor

monitor = SmolLM3Monitor(
    experiment_name="smollm3_run",
    trackio_url=os.getenv("TRACKIO_URL"),
    hf_token=os.getenv("HF_TOKEN"),
    dataset_repo=os.getenv("TRACKIO_DATASET_REPO", "tonic/trackio-experiments"),
    monitoring_mode=os.getenv("MONITORING_MODE", "both"),
)

# model, train_ds, and eval_ds are assumed to be prepared beforehand
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="./output"),
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    callbacks=[monitor.create_monitoring_callback()],
)

trainer.train()

See: Transformers Trainer, Callbacks API.

TRL‑compatible trackio interface

  • The provided trackio module mirrors TRL’s logger with:
    • init(project_name=None, experiment_name=None, ...) -> experiment_id
    • log(metrics: dict, step: Optional[int] = None)
    • finish()
    • helpers: log_config, log_checkpoint, log_evaluation_results, get_experiment_url, is_available
  • To ensure a single shared experiment, you can hand the created monitor to the TRL wrapper:
import trackio  # SmolFactory's TRL-compatible shim

monitor = SmolLM3Monitor("smollm3_run", monitoring_mode="both")
trackio.set_monitor(monitor)  # share the same experiment_id

trackio.init(experiment_name="smollm3_run")
# ... training loop (TRL or custom) calling trackio.log(...)
trackio.finish()

See: TRL Logging.

Emitted signals and artifacts

  • Config: log_configuration() and JSON artifact copy
  • Metrics: log_metrics() with periodic dataset flush; light system metrics via psutil and CUDA stats when available
  • Checkpoints: log_model_checkpoint() records path/size and appends to artifacts
  • Evaluation: log_evaluation_results() per eval step
  • Summary: log_training_summary() with run duration and counts; close() sets final status and persists any tail metrics
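
Strung together in a manual (non-Trainer) loop, and with the caveat that the exact argument shapes are guesses from the method names above, usage looks roughly like:

from monitoring import SmolLM3Monitor  # import path assumed; check your repo copy

def train_step() -> float:
    return 0.42  # stand-in for your real training step

monitor = SmolLM3Monitor("manual_run", monitoring_mode="dataset")
monitor.log_configuration({"lr": 2e-5, "epochs": 1})    # config + JSON artifact copy
for step in range(100):
    loss = train_step()
    monitor.log_metrics({"loss": loss}, step=step)      # buffered; flushed periodically
    if step and step % 50 == 0:
        monitor.log_model_checkpoint(f"./output/ckpt-{step}")
monitor.log_training_summary()                          # run duration and counts
monitor.close()                                         # final status + tail metrics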

HF Datasets quickstart

Hugging Face Datasets: how we persist metrics

  • We keep one logical row per experiment update and union‑merge by experiment_id (non‑destructive). JSON fields are merged with de‑duplication; scalars are overwritten by the latest. This ensures history is preserved and new metrics are appended instead of replacing prior data.
  • Periodic flush: metrics buffered in memory are pushed every TRACKIO_FLUSH_INTERVAL steps and once again on close().
  • The dataset split is train, private by default, and addressable with your HF_TOKEN.

Minimal write/append example:

import os, time
from datasets import Dataset, load_dataset
from huggingface_hub import create_repo

HF_TOKEN = os.getenv("HF_TOKEN")
REPO_ID = os.getenv("TRACKIO_DATASET_REPO", "tonic/trackio-experiments")

def upsert_records(repo_id: str, hf_token: str, new_rows: list[dict]):
    # Ensure repo exists (dataset type)
    create_repo(repo_id, token=hf_token, repo_type="dataset", exist_ok=True)

    try:
        base = load_dataset(repo_id, split="train", token=hf_token)
        rows = base.to_list()
        rows.extend(new_rows)
        ds = Dataset.from_list(rows)
    except Exception:
        ds = Dataset.from_list(new_rows)

    # Push updated split (simple full overwrite strategy)
    ds.push_to_hub(repo_id, token=hf_token, private=True)

# Example during training
for step in range(3):
    metrics = {"timestamp": time.time(), "step": step, "metrics": {"loss": 0.1 * (3 - step)}}
    upsert_records(REPO_ID, HF_TOKEN, [metrics])

Read and flatten nested metrics:

import pandas as pd
from datasets import load_dataset

def load_metrics_df(repo_id: str, hf_token: str) -> pd.DataFrame:
    ds = load_dataset(repo_id, split="train", token=hf_token)
    rows = []
    for r in ds:
        d = {"timestamp": r.get("timestamp"), "step": r.get("step")}
        for k, v in (r.get("metrics") or {}).items():
            if isinstance(v, (int, float)):
                d[k] = v
        rows.append(d)
    return pd.DataFrame(rows).sort_values("step")

Gradio quickstart

Gradio in monitoring

  • Gradio is used as the live dashboard UI in a Space: it pulls the dataset, flattens nested metrics, and renders simple plots (e.g., loss vs step).
  • An internal timer triggers a refresh (polling every few seconds). If a backend API is available, metrics can also be posted directly; otherwise, the UI reads from the dataset.
  • The snippet below illustrates the core idea of “load → flatten → plot”:
import os
import pandas as pd
import gradio as gr
from datasets import load_dataset

REPO_ID = os.getenv("TRACKIO_DATASET_REPO", "tonic/trackio-experiments")
HF_TOKEN = os.getenv("HF_TOKEN")

def fetch_df() -> pd.DataFrame:
    ds = load_dataset(REPO_ID, split="train", token=HF_TOKEN)
    rows = []
    for r in ds:
        d = {"step": r.get("step")}
        for k, v in (r.get("metrics") or {}).items():
            if isinstance(v, (int, float)):
                d[k] = v
        rows.append(d)
    if not rows:
        return pd.DataFrame({"step": [], "loss": []})
    return pd.DataFrame(rows).sort_values("step")

with gr.Blocks() as demo:
    gr.Markdown("## Training metrics (auto-refresh)")
    plot = gr.LinePlot(value=fetch_df(), x="step", y="loss")
    timer = gr.Timer(5.0)               # fires every 5 seconds
    timer.tick(fetch_df, outputs=plot)  # re-pull the dataset and redraw the plot

demo.launch()

Or post metrics directly to a Space backend endpoint:

import os, requests, time

SPACE_URL = os.getenv("TRACKIO_URL")  # e.g. https://tonic-track-tonic.hf.space
HF_TOKEN = os.getenv("HF_TOKEN")

payload = {"experiment_id": "exp_123", "step": 42, "metrics": {"loss": 0.05}}
r = requests.post(
    f"{SPACE_URL}/api/log_metrics",
    headers={"authorization": f"Bearer {HF_TOKEN}"},
    json=payload,
    timeout=10,
)
r.raise_for_status()

Frequently Asked Questions

Is this a unique idea?

  • Not at all! There are many applications that currently do this:

  • I made SmolFactory to be simpler and cheaper than these alternatives

Why did you do this?

  • Just to share an easy way to train
  • To share training recipes as we develop them

My training doesn't work - what is this bug?

I need help or I want to contribute

  • Please join us on Discord
  • 🌟TeamTonic🌟 is always making cool demos! Join our active builders' 🛠️community 👻 Join us on Discord

Parting Thoughts

I originally started making this just for myself, because I find SmolLM3 brilliant. If you have any thoughts about doing better with data and quantization, trainings and trainers, or want to share configs, we're really all ears!

