All HF Hub posts

codelion posted an update 1 day ago
Released 17 production-ready adaptive text classifiers that learn from just 100 examples per class and continuously improve without retraining.

These models achieve 93% average accuracy across enterprise use cases like email routing, fraud detection, document classification, and support ticket categorization. Built on ModernBERT with prototype memory and elastic weight consolidation.

Key benefits: 90% cost reduction vs API solutions, 90-120ms local inference, dynamic class addition, and zero vendor lock-in.

All models are available under the adaptive-classifier organization. Install with pip install adaptive-classifier.

Full technical details: https://huggingface.co/blog/codelion/enterprise-ready-classifiers
Code: https://github.com/codelion/adaptive-classifier
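The prototype-memory idea behind these models can be sketched in plain Python: keep one running-mean vector (prototype) per class and classify by nearest prototype, so new examples, or even brand-new classes, can be added without retraining. This is a toy illustration, not the adaptive-classifier API; the 2-D vectors stand in for ModernBERT embeddings:

```python
import math

class PrototypeMemory:
    """Toy nearest-prototype classifier: one running-mean vector per class."""

    def __init__(self):
        self.protos = {}   # label -> prototype vector
        self.counts = {}   # label -> number of examples seen

    def add_example(self, vec, label):
        # Incrementally update the class prototype (running mean), so adding
        # examples or entirely new classes requires no retraining pass.
        n = self.counts.get(label, 0)
        proto = self.protos.get(label, [0.0] * len(vec))
        self.protos[label] = [(p * n + v) / (n + 1) for p, v in zip(proto, vec)]
        self.counts[label] = n + 1

    def predict(self, vec):
        # Return the label whose prototype is closest in Euclidean distance.
        return min(self.protos, key=lambda lbl: math.dist(vec, self.protos[lbl]))

mem = PrototypeMemory()
mem.add_example([0.9, 0.1], "spam")
mem.add_example([1.1, 0.0], "spam")
mem.add_example([0.0, 1.0], "ham")
print(mem.predict([1.0, 0.2]))  # → spam
```

In the real models, elastic weight consolidation additionally protects the encoder from forgetting earlier classes as new ones are added.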
prithivMLmods posted an update 1 day ago
While on the verge of releasing Poseidon-Reasoning-5M, a dataset built to excel at general thought processes, mathematics, and science across a diverse mixture of domains, I’m also dropping the Gargantua-R1-Compact dataset, a collection of over six million high-quality reasoning QA pair traces. 🤗🚀

✦ Gargantua-R1-Compact : prithivMLmods/Gargantua-R1-Compact

from datasets import load_dataset

dataset = load_dataset("prithivMLmods/Gargantua-R1-Compact", split="train")

Additionally, I’m adding the mini version of Gargantua — the Gargantua-R1-Wee : prithivMLmods/Gargantua-R1-Wee

from datasets import load_dataset

dataset = load_dataset("prithivMLmods/Gargantua-R1-Wee", split="train")

The composition spans:

73.93% core mathematical reasoning: problems, proofs, and computational challenges
12.11% diverse scientific domains: physics, chemistry, biology, and interdisciplinary topics
11.35% competitive coding: algorithms and data structures
1.37% academic science: research-level methodology
0.95% creative and analytical reasoning: logic puzzles and problem-solving tasks
0.25% specialized technical areas: MLOps, LLMs, diffusion models, and CUDA
0.06% data from graphs and charts converted into structured JSON formats

Designed with both rich contextual depth and formal structural clarity, Gargantua-R1-Compact is an optimal resource for advancing research in symbolic reasoning, interpretability, and high-precision question answering in mathematical domains.

✦ Collection : prithivMLmods/gargantua-r1-mod-6896bfd7834e82b89ad2b38b


To learn more, visit the dataset card of each respective dataset.
ovi054 posted an update 3 days ago
Qwen Image + LoRA ⚡

ovi054/Qwen-Image-LORA

Qwen Image is the No. 1 trending Text-to-Image model right now. You can add a custom LoRA and generate images with this Space.

👉 Try it now: ovi054/Qwen-Image-LORA

FlameF0X posted an update about 12 hours ago
I am very sad to say that the budget for creating the SnowflakeCore-G1 1B and 7B MoE models has run out, and I can't pre-train them anymore.
salma-remyx posted an update 3 days ago
𝗣𝗮𝗽𝗲𝗿𝟮𝗣𝗥𝘀
Lately, we've been experimenting with recommending arXiv papers based on the context of what we're building in AI.
At the same time, we're using an agent to help automate the building and testing of Docker Images.

Check out the example here:
https://hub.docker.com/repository/docker/remyxai/2507.20613v1/general

Next, we're tasking our #ExperimentOps agent with opening PRs in a target repo, to evaluate the core concepts from a new research paper in the context of your application and your KPIs.

Operationalize your Experimentation!
Find Your Frontier!
#BeAnExperimenter
sweatSmile posted an update 3 days ago
Teaching a 7B Model to Be Just the Right Amount of Snark

Ever wondered if a language model could get sarcasm? I fine-tuned Mistral-7B using LoRA and 4-bit quantisation on just ~720 hand-picked sarcastic prompt–response pairs from Reddit, Twitter, and real-life conversations.

The challenge? Keeping it sarcastic but still helpful.

LoRA rank 16 to avoid overfitting

4-bit NF4 quantization to fit on limited GPU memory

10 carefully monitored epochs so it didn’t turn into a full-time comedian

Result: a model that understands “Oh great, another meeting” exactly as you mean it.

Read the full journey, tech details, and lessons learned on my blog:
Fine-Tuning Mistral-7B for Sarcasm with LoRA and 4-Bit Quantisation

Try the model here on Hugging Face: sweatSmile/Mistral-7B-Instruct-v0.1-Sarcasm.
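For readers who want to try a similar setup, combining 4-bit NF4 loading with a rank-16 LoRA typically looks like the following with transformers, peft, and bitsandbytes. This is a sketch, not the author's exact config: the target modules, alpha, and dropout values are assumptions, and the usual attention-projection choice is shown.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_id = "mistralai/Mistral-7B-Instruct-v0.1"

# Load the base model in 4-bit NF4 to fit limited GPU memory.
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Rank-16 LoRA keeps the trainable-parameter count small, which helps
# avoid overfitting on a ~720-example dataset.
lora = LoraConfig(
    r=16,
    lora_alpha=32,            # assumed value
    lora_dropout=0.05,        # assumed value
    target_modules=["q_proj", "v_proj"],  # assumed; attention projections are a common choice
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```

Training then proceeds with a standard Trainer loop over the sarcastic pairs; only the LoRA adapters are updated while the quantised base weights stay frozen.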

MonsterMMORPG posted an update about 13 hours ago
Qwen Image is literally unchallenged at understanding complex prompts and writing amazing text on generated images. This model feels almost as if it’s illegal to be open source and free. It is my new tool for generating thumbnail images. Even with low-effort prompting, the results are excellent.
This tutorial shows how these images were generated with prompts made by Gemini 2.5 Pro:
Qwen Image Dominates Text-to-Image: 700+ Tests Reveal Why It’s Better Than FLUX — Presets Published
https://youtu.be/R6h02YY6gUs

Gemini 2.5 Pro is freely available in Google AI Studio.
All images were generated in the easy-to-use SwarmUI, and they are unmodified raw generations.
SwarmUI and ComfyUI install tutorial:
Master Local AI Art & Video Generation with SwarmUI (ComfyUI Backend): The Ultimate 2025 Tutorial
https://www.youtube.com/watch?v=fTzlQ0tjxj0

Kseniase posted an update about 21 hours ago
6 Must-read books about AI and Machine Learning:

Sharing some free, useful resources for you. In this collection, we’ve gathered the most recent books to give you up-to-date information on key fundamental topics. Hope this helps you master AI and machine learning:

1. Machine Learning Systems by Vijay Janapa Reddi → https://www.mlsysbook.ai/
Provides a framework for building effective ML solutions, covering data engineering, optimization, hardware-aware training, inference acceleration, architecture choice, and other key principles

2. Generative Diffusion Modeling: A Practical Handbook by Zihan Ding, Chi Jin → https://arxiv.org/abs/2412.17162
Offers a unified view of diffusion models: probabilistic, score-based, consistency, rectified flow, pre/post-training. It aligns notations with code to close the “paper-to-code” gap.

3. Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges → https://arxiv.org/abs/2104.13478
Explores unified geometric principles for analyzing neural network architectures (CNNs, RNNs, GNNs, Transformers) and for guiding the design of future ones

4. Mathematical Foundations of Geometric Deep Learning by Haitz Saez de Ocariz Borde and Michael Bronstein → https://arxiv.org/abs/2508.02723
Dives into the key math concepts behind geometric deep learning: geometric and analytical structures, vector calculus, differential geometry, etc.

5. Interpretable Machine Learning by Christoph Molnar → https://github.com/christophM/interpretable-ml-book
Practical guide to simple, transparent models (e.g., decision trees) and model-agnostic methods like LIME, Shapley values, permutation importance, and accumulated local effects.

6. Understanding Deep Learning by Simon J.D. Prince → https://udlbook.github.io/udlbook/
Explores core deep learning concepts: models, training, evaluation, RL, and architectures for images, text, and graphs, while addressing open theoretical questions

Also, subscribe to the Turing Post: https://www.turingpost.com/subscribe
dhruv3006 posted an update 3 days ago
GPT-5 for Computer Use agents.

Same tasks, same grounding model; we just swapped GPT-4o for GPT-5 as the thinking model.

Left = GPT-4o, right = GPT-5.

Watch GPT-5 pull away.

Reasoning model: OpenAI GPT-5

Grounding model: Salesforce GTA1-7B

Action space: CUA Cloud Instances (macOS/Linux/Windows)


The task is: "Navigate to {random_url} and play the game until you reach a score of 5/5." Each task is set up by having Claude generate a random app from a predefined list of prompts (multiple-choice trivia, form filling, or color matching).


Try it yourself here: https://github.com/trycua/cua

Docs: https://docs.trycua.com/docs/agent-sdk/supported-agents/composed-agents
mitkox posted an update 4 days ago
Earlier today, humanity faced a critical threat from a catastrophic chart crime. I asked my local Qwen3 Coder Flash to fix it. Sleep well, fellow humans. The visualization singularity is nigh, and it runs with zero warnings.