Introducing 🤏🏻🏭SmolFactory
This post covers:
- how to use Hugging Face
- how to make and fine-tune models
- how to make a model for business, for medicine, or for local-language understanding
- how to make an MCP server for AI
- how to deploy an API for Hugging Face AI models
- how to use Transformers and PyTorch
- how to fine-tune gpt-oss
- 🤏🏻🏭SmolFactory helps you make SmolLM3 or gpt-oss models easily, cheaply, and quickly
- it allows you to deploy SFT or DPO fine-tuning experiments
- automatically uploads your model
- automatically deploys live tracking of your model-making experiment
- automatically deploys an application where you can test your model
- helps you load your data easily
- takes care of experiment parameters
- allows you to customize your training (for advanced users)
- deploys an API and MCP server with your model
- you can use all of this immediately
What is SmolFactory
🤏🏻🏭SmolFactory is a one-click way for you to make fine-tuned models, track your fine-tune, publish your model, and test it immediately.
What It's For
- easily and cheaply fine-tune open-source models
- quickly train and deploy a model
- track the process on the web & mobile
Who it's For
- learners
- businesses
- open-source advocates
Businesses
The value promised by AI/ML really hasn't been realized by businesses, and yet with a minuscule budget and a positive attitude, it's truly possible for enterprises to produce extremely valuable intellectual property that adds revenue and company value almost immediately.
Developers
What I've noticed is that unless they're spoon-fed training recipes, fewer than 0.01% of developers are capable of training a model.
Check Out Some Results
Use the interactive demo above to try a SmolLM3 model trained on `legmlai`'s hermes-fr data for French understanding.
The Inspiration
What I care about personally:
- maybe I'm a boring older guy now, but I actually care about:
  - corporate law
  - business law
  - business understanding
  - the French language
  - reasoning abilities in the topics above

But you can use 🤏🏻🏭SmolFactory for whatever you like.
SmolLM3
Have you actually checked out SmolLM3 by Elie Bakoush and the amazing SmolLM3 team?

— Elie Bakoush, Carlos Miguel Patiño, Anton Lozhkov, Edward Beeching, Aymeric Roucher, Nouamane Tazi, Aksel Joonas Reedi, Guilherme Penedo, Hynek Kydlicek, Clémentine Fourrier, Nathan Habib, Kashif Rasul, Quentin Gallouédec, Hugo Larcher, Mathieu Morlon, Joshua, Vaibhav Srivastav, Xuan-Son Nguyen, Colin Raffel, Lewis Tunstall, Loubna Ben Allal, Leandro von Werra, Thomas Wolf

The thing fits on a small machine, and by the way, the Hugging Face TB team has released the checkpoints so you can also retrain it, and nobody's really working with it! Pure insanity; it's really one of the best models in the world in my opinion. SmolLM3 proves that smart engineering and open collaboration can deliver compact, capable, and versatile models that are efficient, innovative, and ready to scale. If you want to see something cool, check out PyTorch's quant that literally fits on a phone using ExecuTorch! By the way, this means that you could fine-tune SmolLM3 to execute Android commands and run your phone using this dataset!
GPT-OSS
OpenAI's GPT-OSS models are their latest (not their only!) open-source models. They were published last week to a "mixed reception", but I personally think they're really fantastic. First off, they're open: released under the Apache-2.0 license, so you can fine-tune, modify, and distribute them commercially. Second, they're really smart reasoning models on topics like coding, math, and science, and paired with multilingual ability, this means they can stay smart on other topics, like business, law, and multilingual understanding. The 120B version can run on a single 80 GB GPU (consumer-ish grade, if you're a consumer who spends $12K on hardware for compute); the 20B fits on 16 GB of VRAM (basically a gaming laptop). They have a long context of 128k tokens, and further quantization can make them quite small and usable. Transparent weights, hackable, and "own-your-stack" deployment that pairs beautifully with [SmolFactory's workflows](#how-it-works).
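As a quick, hedged sketch, loading the 20B variant with a recent `transformers` release looks roughly like this (memory needs depend on your hardware and quantization):

```python
# Minimal sketch: run gpt-oss-20b locally with Transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openai/gpt-oss-20b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # keep the checkpoint's native precision
    device_map="auto",   # spread across available devices
)

inputs = tokenizer("Résume le droit des sociétés en une phrase.", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0]))
```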
France
Here in France, we have a ton of talent. For example, Meta's FAIR has an office here producing a ton of research. It's also, literally, the birthplace of Hugging Face. You can't really talk about AI in France without at least mentioning Mistral. By the way, we host outstanding contributors like Maziyar Panahi. The government here is also on Hugging Face, literally pushing open-source datasets, and computing is a public good, which is super cool. And yet, nothing much is happening. Government-sponsored sovereign-AI initiatives are a disaster, mostly made by web devs from Russia, and are toxic, misogynist, and colonialist. Meanwhile, the government has stopped maintaining its own leaderboard for French understanding, probably because the models it sponsored are all at the bottom of the list, when they place at all. Meanwhile, community models and initiatives continue to outpace any government-sponsored project by leaps and bounds. Bright lights include Legml.ai's clean dataset and French-speaking model, the open-source French-speaking leaderboard, and the business-understanding leaderboard.
What We're Trying To Do Here
- I was trying to:
  - get my name and a model on the leaderboards
  - fine-tune a model
- What this does:
  - helps you train a model
  - publishes your results on Hugging Face
🤏🏻🏭SmolFactory
There are two ways to use 🤏🏻🏭SmolFactory:
- on the command line
- with a simple interface
Get Started
🤏🏻🏭SmolFactory works on any cloud provider, but we'll show you how to use it with Hugging Face.
Get your Hugging Face tokens
- get tokens with `READ` and `WRITE` permissions in your settings
- write these down somewhere, for example in Notepad on Windows (a quick way to verify them is sketched below)
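If you'd like to sanity-check a token before pasting it in, here's a minimal sketch using `huggingface_hub` (the token string is a placeholder):

```python
# Verify a token and see which account it belongs to.
from huggingface_hub import whoami

info = whoami(token="hf_...")  # placeholder: your READ or WRITE token
print(info["name"])  # prints your username if the token is valid
```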
Duplicate the Interface
Visit https://huggingface.co/spaces/Tonic/SmolFactory

SmolFactory is an end-to-end model maker and deployer on Hugging Face. The first thing you'll notice is the warning at the top of the site.

Click on `Duplicate this Space`, and add your `READ` and `WRITE` tokens in the fields provided.
Select A GPU
Before you click `Deploy`, you should select a GPU. L4s and A10Gs work great for SmolLM3!
Click Deploy This Space
Once you've deployed the Space with a GPU, the warning will disappear and be replaced by a validation message.
Use the Interface
Now you can use the interface to train your model, deploy your training monitoring, and upload and deploy your model!
Configure your run in simple steps:
- Click on `SmolLM3` to get started
- You might have to select `gpt-oss` at first to reveal the interface; just select `SmolLM3` again if that's the case
- You might have to select `Use Defaults`; the interface provides you with customizable defaults
- Add your name in the provided field for your published model card
Click Run Pipeline
- And that's really about it; you should see the training commands get executed in the logs below
[add gif, or video]
Advanced Configuration
There's also a simple interface for advanced configuration of experiments where you can choose a few parameters. This isn't so complicated, but unless you really care, let's not get into it here.
What it Does
Once you've clicked `Run Pipeline`, five things happen.

SmolFactory Starts Your Training

The training will start.

SmolFactory Tracks Your Training

The training will be tracked in a Hugging Face dataset:
- a dataset with all your training parameters and data will be created
- it will be updated live

SmolFactory Monitors Your Training with a Custom Monitor

The training will be monitored in a Space:
- a custom, personal experiment tracker tracks your training
- to find it, navigate to your profile (don't close the SmolFactory window)
- click on the latest experiment (the first one in the list on your dashboard)
- you can use this on mobile to track your training on the go

SmolFactory Uploads Your Model (after the training is completed)

Your model is automatically uploaded.

SmolFactory Deploys A Demo Space

Your model is demoed on a Hugging Face Space:
- use the interactive demo above
- SmolFactory automatically deploys a demo Space
- click on Settings at the top right of your Space, then select `ZeroGPU` (this requires a Pro account)
- that's it!
Congratulations
- You just:
  - published a model
  - tracked your training
  - deployed a demo on Hugging Face!

Congratulations!
Now What ?
Well, now what? That's up to you; the world is your oyster!
- promote your model by writing a post
- use your model via API or MCP
- share your model with the world!

Your model is now available via API and MCP!
- scroll to the bottom of your demo and click on `Use Via API or MCP`
- you can already use your model in your apps or your agents; see the sketch below
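For instance, calling your demo Space from Python with `gradio_client` might look like this sketch (the Space id and `api_name` are placeholders; copy the real values from your demo's `Use Via API or MCP` panel):

```python
# Hypothetical client call to your deployed demo Space.
from gradio_client import Client

client = Client("your-username/your-model-demo")  # placeholder Space id; pass hf_token=... if private
reply = client.predict(
    "Bonjour, explique-moi la SARL.",
    api_name="/chat",  # placeholder endpoint; see your demo's API panel
)
print(reply)
```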
How it works
Entry Point
- Role: Orchestrates the run. Initializes configuration, model, data, and training logic.
- Starts: Validation, environment setup, and the end-to-end workflow.
Configuration Management
```mermaid
graph LR
    Configuration_Management["Configuration Management"]
    Training_Orchestration["Training Orchestration"]
    Training_Orchestration -- "retrieves configuration from" --> Configuration_Management
    click Configuration_Management href "https://github.com/Josephrp/SmolFactory/blob/main/docs/Configuration_Management.md" "Details"
```
- Purpose: Central source of truth for hyperparameters, paths, and runtime options.
- Guarantee: Validated, reproducible experiments via structured settings (a minimal sketch follows below).
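As an illustration, such a configuration object might look like this sketch (field names are illustrative, not SmolFactory's actual schema):

```python
# Illustrative config: one validated, serializable source of truth per run.
from dataclasses import asdict, dataclass

@dataclass
class TrainConfig:
    model_name: str = "HuggingFaceTB/SmolLM3-3B"
    dataset_repo: str = "your-org/your-dataset"  # placeholder
    learning_rate: float = 2e-5
    batch_size: int = 8
    max_steps: int = 1000
    output_dir: str = "./output"

    def validate(self) -> None:
        # fail fast on nonsense values so experiments stay reproducible
        assert self.learning_rate > 0 and self.batch_size > 0

cfg = TrainConfig()
cfg.validate()
print(asdict(cfg))  # log this dict alongside the experiment
```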
Model Abstraction
```mermaid
graph LR
    EntryPoint["EntryPoint"]
    Model_Abstraction["Model Abstraction"]
    EntryPoint -- "initiates model loading in" --> Model_Abstraction
    click Model_Abstraction href "https://github.com/Josephrp/SmolFactory/blob/main/docs/Model_Abstraction.md" "Details"
```
- What it does: Loads and prepares models (e.g., quantization, adapters) and the tokenizer.
- Why: Keeps training logic independent from specific architectures (see the sketch below).
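A hedged sketch of what this layer typically does, using standard `transformers` and `peft` calls (the adapter settings are illustrative, not SmolFactory's actual defaults):

```python
# Illustrative model abstraction: load base model + tokenizer, attach a LoRA
# adapter, and hand back objects the training loop can use unchanged.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

def load_model_and_tokenizer(model_name: str = "HuggingFaceTB/SmolLM3-3B"):
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype="auto", device_map="auto"
    )
    lora = LoraConfig(
        r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"
    )
    return get_peft_model(model, lora), tokenizer
```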
Data Pipeline
- Scope: Dataset loading, preprocessing/tokenization, splitting, and performant dataloaders.
- Output: Clean, batched data streams for training/eval (sketched below).
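A minimal sketch of the load → tokenize → batch flow with `datasets` (the dataset id and `text` column are placeholders):

```python
# Illustrative data pipeline: load a dataset, tokenize it, and build a dataloader.
from datasets import load_dataset
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM3-3B")

def tokenize(batch):
    # "text" is a placeholder; use your dataset's actual column name
    return tokenizer(batch["text"], truncation=True, max_length=2048)

ds = load_dataset("your-org/your-dataset", split="train")  # placeholder dataset id
ds = ds.map(tokenize, batched=True, remove_columns=ds.column_names)

collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal LM batching
loader = DataLoader(ds, batch_size=8, shuffle=True, collate_fn=collator)
```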
Training Orchestrator
- Loop: Forward/backward passes, optimization, checkpointing, evaluation, callbacks.
- Integration: Works with acceleration libraries and logging/metrics collectors (a minimal sketch follows below).
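Since the orchestrator runs SFT (or DPO) fine-tuning, a minimal, hedged sketch of such a loop with TRL's `SFTTrainer` looks like this (the model id, dataset, and arguments are illustrative; SmolFactory's own pipeline adds its config, monitoring callback, and upload steps around a loop like this):

```python
# Illustrative SFT loop with TRL.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

train_ds = load_dataset("trl-lib/Capybara", split="train[:1%]")  # tiny public example dataset

trainer = SFTTrainer(
    model="HuggingFaceTB/SmolLM3-3B",  # SFTTrainer also accepts an already-loaded model
    train_dataset=train_ds,
    args=SFTConfig(output_dir="./output", max_steps=100, logging_steps=10),
)
trainer.train()                        # forward/backward, optimization, checkpointing
trainer.save_model("./output/final")   # final checkpoint, ready to upload
```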
Real-time tracking and dataset flow
- Live monitor: A lightweight monitor collects metrics during training.
- Dual storage: Metrics flow to a Trackio Space API and a Hugging Face Dataset for history.
- Visualization: Web UI shows real-time curves and experiment parameters.
- Persistence: Final artifacts and logs remain accessible after runs.
What gets produced
- Local outputs: Checkpoints, configs, training summaries, logs.
- HF Dataset: Structured experiment records and time series of metrics.
- Model repo: Final model weights with a model card and metadata.
- Demo Space: Optional interactive app to try the model immediately.
End-to-end sequence
Step-by-step
- A. User starts pipeline: Entry point validates settings and environment.
- B. Auth & config: Tokens/env vars loaded; run is parameterized.
- C. Dataset repo setup: Creates/uses a dataset repo for experiment logging.
- D. Trackio deployment: Prepares a monitoring space endpoint for live updates.
- E. Training execution: Model trains; callbacks emit metrics and checkpoints.
- F. Real-time data flow: Metrics stream to Space + Dataset for visualization and history.
- G. Model push & cleanup: Upload final model; summarize outputs.
Track-Tonic
Monitoring, logging, and callbacks
Overview
- SmolLM3Monitor: Central object for experiment tracking with dual targets: Trackio Space and Hugging Face Datasets. Supports modes: `both`, `dataset`, `trackio`, `none`.
- Trainer callback: A custom `transformers.TrainerCallback` streams logs, checkpoints, eval results, and light system metrics.
- TRL compatibility layer: A `trackio` module with `init`/`log`/`finish` mirrors TRL's logging interface for drop-in usage.
See: SmolFactory
References: Transformers Trainer, Transformers Callbacks, TRL Logging.
Monitoring modes and configuration
- Env/args:
  - `MONITORING_MODE`: `both` | `dataset` | `trackio` | `none`
  - `HF_TOKEN`: required for dataset writes and Trackio Space auth
  - `TRACKIO_URL` or `TRACKIO_SPACE_ID`: selects the Trackio Space
  - `TRACKIO_DATASET_REPO`: HF dataset repo (default `tonic/trackio-experiments`)
  - `TRACKIO_FLUSH_INTERVAL`: metrics batching frequency (default `10`)
- Constructor: `SmolLM3Monitor(experiment_name, trackio_url, hf_token, dataset_repo, monitoring_mode, ...)` resolves env fallbacks.
Dataset persistence mechanics
- Merges by `experiment_id`, de-dups metrics by `(step, timestamp)`, and normalizes to nested entries: `{timestamp, step, metrics: {...}}` (a sketch of the merge idea follows below).
- Parameters are dict-merged; artifacts/logs are de-duplicated while preserving order.
- Saves on a cadence via `TRACKIO_FLUSH_INTERVAL` and on close; final status and counts are recorded.
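A hedged sketch of that merge/de-dup idea (a hypothetical helper, not the actual implementation):

```python
# Hypothetical union-merge of two experiment records: parameters are
# dict-merged, metrics de-duplicated by (step, timestamp), order preserved.
def merge_experiment(old: dict, new: dict) -> dict:
    merged = dict(old)
    merged["parameters"] = {**old.get("parameters", {}), **new.get("parameters", {})}
    seen = {(m["step"], m["timestamp"]) for m in old.get("metrics", [])}
    merged["metrics"] = old.get("metrics", []) + [
        m for m in new.get("metrics", []) if (m["step"], m["timestamp"]) not in seen
    ]
    return merged

a = {"parameters": {"lr": 2e-5}, "metrics": [{"step": 1, "timestamp": 1.0, "metrics": {"loss": 0.9}}]}
b = {"parameters": {"bs": 8}, "metrics": [{"step": 1, "timestamp": 1.0, "metrics": {"loss": 0.9}},
                                          {"step": 2, "timestamp": 2.0, "metrics": {"loss": 0.7}}]}
print(merge_experiment(a, b))  # history kept, only the new step appended
```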
Trackio Space integration
- Validates space connectivity at init; gracefully degrades to dataset‑only if unavailable.
- Logs configuration, metrics, checkpoints, and summary via the `TrackioAPIClient`.
Transformers Trainer callback integration
- The monitor exposes `create_monitoring_callback()`, which returns a `transformers.TrainerCallback` implementing:
  - `on_log`: enriches logs with step time, throughput (if tokens are available), token stats, and optional token accuracy, then emits to backends
  - `on_save`: logs checkpoints when saved
  - `on_evaluate`: records eval metrics per step
  - `on_train_begin` / `on_train_end`: lifecycle markers and final flush
Example usage with Trainer:
```python
import os

from transformers import Trainer, TrainingArguments

# SmolLM3Monitor ships with SmolFactory; import it from wherever it
# lives in your checkout.

monitor = SmolLM3Monitor(
    experiment_name="smollm3_run",
    trackio_url=os.getenv("TRACKIO_URL"),
    hf_token=os.getenv("HF_TOKEN"),
    dataset_repo=os.getenv("TRACKIO_DATASET_REPO", "tonic/trackio-experiments"),
    monitoring_mode=os.getenv("MONITORING_MODE", "both"),
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="./output"),
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    callbacks=[monitor.create_monitoring_callback()],
)
trainer.train()
```
See: Transformers Trainer, Callbacks API.
TRL‑compatible trackio interface
- The provided `trackio` module mirrors TRL's logger with:
  - `init(project_name=None, experiment_name=None, ...) -> experiment_id`
  - `log(metrics: dict, step: Optional[int] = None)`
  - `finish()`
  - helpers: `log_config`, `log_checkpoint`, `log_evaluation_results`, `get_experiment_url`, `is_available`
- To ensure a single shared experiment, you can hand the created monitor to the TRL wrapper:
```python
import trackio

monitor = SmolLM3Monitor("smollm3_run", monitoring_mode="both")
trackio.set_monitor(monitor)  # share the same experiment_id
trackio.init(experiment_name="smollm3_run")
# ... training loop (TRL or custom) calling trackio.log(...)
trackio.finish()
```
See: TRL Logging.
Emitted signals and artifacts
- Config: `log_configuration()` and a JSON artifact copy
- Metrics: `log_metrics()` with periodic dataset flush; light system metrics via `psutil` and CUDA stats when available
- Checkpoints: `log_model_checkpoint()` records path/size and appends to artifacts
- Evaluation: `log_evaluation_results()` per eval step
- Summary: `log_training_summary()` with run duration and counts; `close()` sets final status and persists any tail metrics (a sketch of this call pattern follows below)
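As a hedged sketch of the call pattern, reusing the `monitor` from the Trainer example above (argument shapes here are assumptions; the real signatures live in the SmolFactory source):

```python
# Assumed call pattern for the monitor's logging helpers.
monitor.log_configuration({"learning_rate": 2e-5, "batch_size": 8})
monitor.log_metrics({"loss": 0.42}, step=100)
monitor.log_model_checkpoint("./output/checkpoint-100")
monitor.log_evaluation_results({"eval_loss": 0.50}, step=100)
monitor.log_training_summary()  # run duration and counts
monitor.close()                 # final status + tail-metric flush
```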
Hugging Face Datasets: how we persist metrics
- We keep one logical row per experiment update and union-merge by `experiment_id` (non-destructive). JSON fields are merged with de-duplication; scalars are overwritten by the latest. This ensures history is preserved and new metrics are appended instead of replacing prior data.
- Periodic flush: metrics buffered in memory are pushed every `TRACKIO_FLUSH_INTERVAL` steps and once again on `close()`.
- The dataset split is `train`, private by default, and addressable with your `HF_TOKEN`.
Minimal write/append example:
```python
import os, time

from datasets import Dataset, load_dataset
from huggingface_hub import create_repo

HF_TOKEN = os.getenv("HF_TOKEN")
REPO_ID = os.getenv("TRACKIO_DATASET_REPO", "tonic/trackio-experiments")

def upsert_records(repo_id: str, hf_token: str, new_rows: list[dict]):
    # Ensure the repo exists (dataset type)
    create_repo(repo_id, token=hf_token, repo_type="dataset", exist_ok=True)
    try:
        base = load_dataset(repo_id, split="train", token=hf_token)
        rows = base.to_list()
        rows.extend(new_rows)
        ds = Dataset.from_list(rows)
    except Exception:
        ds = Dataset.from_list(new_rows)
    # Push the updated split (simple full-overwrite strategy)
    ds.push_to_hub(repo_id, token=hf_token, private=True)

# Example during training
for step in range(3):
    metrics = {"timestamp": time.time(), "step": step, "metrics": {"loss": 0.1 * (3 - step)}}
    upsert_records(REPO_ID, HF_TOKEN, [metrics])
```
Read and flatten nested metrics:
```python
import pandas as pd
from datasets import load_dataset

def load_metrics_df(repo_id: str, hf_token: str) -> pd.DataFrame:
    ds = load_dataset(repo_id, split="train", token=hf_token)
    rows = []
    for r in ds:
        d = {"timestamp": r.get("timestamp"), "step": r.get("step")}
        for k, v in (r.get("metrics") or {}).items():
            if isinstance(v, (int, float)):
                d[k] = v
        rows.append(d)
    return pd.DataFrame(rows).sort_values("step")
```
Gradio in monitoring
- Gradio is used as the live dashboard UI in a Space: it pulls the dataset, flattens nested metrics, and renders simple plots (e.g., `loss` vs `step`).
- An internal timer triggers a refresh (polling every few seconds). If a backend API is available, metrics can also be posted directly; otherwise, the UI reads from the dataset.
- The snippet below illustrates the core idea of "load → flatten → plot":
```python
import os

import gradio as gr
import pandas as pd
from datasets import load_dataset

REPO_ID = os.getenv("TRACKIO_DATASET_REPO", "tonic/trackio-experiments")
HF_TOKEN = os.getenv("HF_TOKEN")

def fetch_df() -> pd.DataFrame:
    ds = load_dataset(REPO_ID, split="train", token=HF_TOKEN)
    rows = []
    for r in ds:
        d = {"step": r.get("step")}
        for k, v in (r.get("metrics") or {}).items():
            if isinstance(v, (int, float)):
                d[k] = v
        rows.append(d)
    if not rows:
        return pd.DataFrame({"step": [], "loss": []})
    return pd.DataFrame(rows).sort_values("step")

with gr.Blocks() as demo:
    gr.Markdown("## Training metrics (auto-refresh)")
    plot = gr.LinePlot(value=fetch_df(), x="step", y="loss")
    timer = gr.Timer(5.0)
    timer.tick(fetch_df, outputs=plot)  # poll the dataset every 5 s

demo.launch()
```
Or post metrics directly to a Space backend endpoint:

```python
import os, requests

SPACE_URL = os.getenv("TRACKIO_URL")  # e.g. https://tonic-track-tonic.hf.space
HF_TOKEN = os.getenv("HF_TOKEN")

payload = {"experiment_id": "exp_123", "step": 42, "metrics": {"loss": 0.05}}
r = requests.post(
    f"{SPACE_URL}/api/log_metrics",
    headers={"authorization": f"Bearer {HF_TOKEN}"},
    json=payload,
    timeout=10,
)
r.raise_for_status()
```
Frequently Asked Questions
Is this a unique idea?

Not at all! There are many applications that currently do this. I made SmolFactory to be simpler and cheaper than these alternatives.
Why did you do this?
- Just to share an easy way to train
- to share training recipes as we develop them
My training doesn't work, what is this bug?
- Please report all the bugs to https://github.com/josephrp/smolfactory
I need help or I want to contribute
- Please join us on Discord
- 🌟TeamTonic🌟 is always making cool demos! Join our active builders' 🛠️ community 👻
Parting Thoughts
I originally started making this just for myself because I find SmolLM3 brilliant. If you have any thoughts about doing better with data and quantization, trainings and trainers, or want to share configs, we're really all ears!