AI & ML interests

None defined yet.

Recent Activity

jbilcke-hf 
posted an update 2 days ago
view post
Post
330
Did you know you can use AI-Toolkit by Ostris (https://github.com/ostris/ai-toolkit) to train AI image and video models directly inside a Hugging Face space?

The benefit of doing so is that you will get a nice UI, if you do not want to deal with JSON files and CLI shenanigans!

I have created a ready-to-use template you can deploy to your own HF Space to train generative models in a few clicks.

All you have to do is to duplicate my space to your private space by going here: jbilcke-hf/ai-toolkit

This space requires a good GPU and most importantly a persistent storage, as everything is stored in /data.

Currently multiple GPUs isn't supported yet in the UI but it is planned(https://discord.com/channels/1144033039786188891/1166417204015808603/1404851082361835623)

P.S.: Don't forget to set your space to private to make sure only you can access, delete and run training jobs!
louisbrulenaudet 
posted an update 7 days ago
view post
Post
5796
Supercharge Apple’s Shortcuts using Cloudflare Workers and Gemini within minutes (and for free, up to 1,500 requests per day) ☁️✨

Hello everyone, last week, while experimenting for fun, I created an API that allows you to easily access AI models (in this case, Google's) from the Shortcut app in order to analyze data from my apps and make the most of it thanks to the generative capabilities of advanced models.

It costs me nothing, and I think it might be good to share it so that others can build on it.

In README.md, you will find everything you need to get started and put your own microservice into production, which you can call from the app’s HTTP request features.

You will simply be asked to have a free Cloudflare account and an API key obtained from Google's AI Studio.

Feel free to take a look and get back to me if you encounter any problems during deployment.

Here is the GitHub repo where you can find all the source code and run it on your own: https://github.com/louisbrulenaudet/genai-api
merve 
posted an update 7 days ago
view post
Post
5987
large AI labs have dropped so many open models last week 🔥 don't miss out on them

→ Apple released on-device vision LMs apple/fastvlm-68ac97b9cd5cacefdd04872e & apple/mobileclip2-68ac947dcb035c54bcd20c47
→ OpenGVLab released InternVL3.5, 32 new vision LMs with one based on gpt-oss! (OS) OpenGVLab/internvl35-68ac87bd52ebe953485927fb
→ MSFT released a killer small TTS model (OS) microsoft/VibeVoice-1.5B

find more herehttps://huggingface.co/collections/merve/august-29-releases-68b5a3754cfb8abf59e2b486
  • 1 reply
·
louisbrulenaudet 
posted an update 8 days ago
view post
Post
371
Although more and more code editors are aligning themselves with the AGENTS.md file standard, some still use specific nomenclatures that can make it difficult to maintain different configuration files when several people are working on the same project with different agents.

Bodyboard addresses this by generating canonical instructions for code helpers from a single AGENTS.md file, thereby streamlining the production of adapter outputs for Gemini CLI, Copilot, Cline, Claude, Rules, Windsurf, and OpenAI Codex integrations.

You just have to:
npm install -g bodyboard

Then run, at the root of your project:
bodyboard all

Link to npm: https://www.npmjs.com/package/bodyboard
Link to the GitHub repo: https://github.com/louisbrulenaudet/bodyboard

It's a very simple project, but it addresses certain issues I've encountered, so why not make it available to everyone...

If you have other ideas for adapters to create, feel free to open a PR on the GitHub repo.
merve 
posted an update 13 days ago
fdaudens 
posted an update 25 days ago
view post
Post
5832
Want to learn to build an AI Agent? I put together a cookbook for creating your own news research agent with OpenAI GPT-OSS:

- Searches headlines & specific sites
- Pulls full articles when you need depth
- Summarizes with clickable sources
- Runs in a simple Gradio chat UI
- No GPU, no local setup — just open-weight GPT-OSS models via Hugging Face

If you’ve been wanting to try agents but weren’t sure where to start, this is an end-to-end example you can fork, run, and adapt.

Full guide + code https://huggingface.co/blog/fdaudens/openai-gpt-oss-agent-inference-providers
  • 2 replies
·
nroggendorff 
posted an update 26 days ago
view post
Post
4596
No, I did not create those bots that just got banned today.
·
fdaudens 
posted an update 27 days ago
view post
Post
510
What can OpenAI’s new open models do with the news? I built a News Agent to find out.

It can answer questions about the news in real time, and every answer comes with original source links so you can dive deeper.

Ask it things like:
- "What are the top news stories today?"
- "What's the latest on artificial intelligence?"
- Follow-up questions on specific stories

Runs with Hugging Face inference providers, letting you compare results from the OpenAI 20B and 120B models

So far, I’m quite impressed by the capabilities of even the smaller 20B model. Definitely not a perfect project, but curious to hear your thoughts!

fdaudens/gpt-oss-news-agent
  • 2 replies
·
BrigitteTousi 
posted an update 28 days ago
fdaudens 
posted an update 28 days ago
view post
Post
3374
OpenAI’s GPT-OSS has sparked ~400 new models on Hugging Face and racked up 5M downloads in less than a week, already outpacing DeepSeek R1’s first-week numbers.

For comparison: when R1 launched, I tracked 550 derivatives (across 8 base models) in a week, with ~3M downloads. GPT-OSS is ahead on adoption and engagement.

It’s also the most-liked release of any major LLM this summer. The 20B and 120B versions quickly shot past Kimi K2, GLM 4.5, and others in likes.

Most-downloaded GPT-OSS models include LM Studio and Unsloth AI versions:
1️⃣ openai/gpt-oss-20b - 2.0M
2️⃣ lmstudio-community/gpt-oss-20b-MLX-8bit - 750K
3️⃣ openai/gpt-oss-120b - 430K
4️⃣ unsloth/gpt-oss-20b-GGUF - 380K
5️⃣ lmstudio-community/gpt-oss-20b-GGUF - 330K

The 20B version is clearly finding its audience, showing the power of smaller, faster, more memory- and energy-efficient models. (These numbers don’t include calls to the models via inference providers, so the real usage is likely even bigger, especially for the 120B version)

Open-weight models let anyone build on top. Empower the builders, and innovation takes off. 🚀
  • 1 reply
·
BrigitteTousi 
posted an update about 1 month ago
view post
Post
524
New interactive viz from AI World showing OpenAI's new open model gpt-oss-120b breaking into the top 50 most liked models of all time on the Hub in under a day! ☄️☄️☄️
merve 
posted an update about 1 month ago
view post
Post
3245
GPT-4.1-mini level model right in your iPhone 🤯

openbmb/MiniCPM-V-4 is only 4B while surpassing GPT-4.1-mini in vision benchmarks 🔥

allows commercial use as well!
fdaudens 
posted an update about 1 month ago
view post
Post
2634
Well, it took just 2 hours for openai/gpt-oss-120b to hit #1 on Hugging Face. Don’t remember seeing anything rise that fast!
  • 1 reply
·
alielfilali01 
posted an update about 1 month ago
KingNish 
posted an update about 1 month ago
merve 
posted an update about 1 month ago
view post
Post
1130
we're all sleeping on this OCR model rednote-hilab/dots.ocr 🔥

dots.ocr is a new 3B model with sota performance, support for 100 languages & allowing commercial use! 🤯

single e2e model to extract image, convert tables, formula, and more into markdown 📝
try it MohamedRashad/Dots-OCR
merve 
posted an update about 1 month ago
view post
Post
662
massive releases and tons of Flux 1. Krea LoRas past week!
here's some of the picks, find more models in collection 🫡 merve/releases-august-2-6890c14248203522b7d0267f

LLMs 💬
> Tencent dropped tencent/Hunyuan-7B-Instruct
> Qwen released Qwen/Qwen3-Coder-30B-A3B-Instruct, 30B MoE with 3B params for coding (OS)

vision/multimodal
> RedNote released rednote-hilab/dots.ocr - 3B OCR model (OS)
> Cohere released CohereLabs/command-a-vision-07-2025 - 112B (dense!) VLM for 6 languages
> StepFun-AI shipped stepfun-ai/step3 - 321B MoE VLM (OS)
> Skywork shipped Skywork/Skywork-UniPic-1.5B - new any-to-any model (image+text → image+text) (OS)
Abhaykoul 
posted an update about 1 month ago
view post
Post
3969
🚀 Dhanishtha-2.0-preview-0825 Is Here

The Intermediate Thinking Model just leveled up again.

With sharper reasoning, better tool use, and expanded capabilities, Dhanishtha-2.0-preview-0825 is now live and ready to impress.

🧠 What Makes Dhanishtha Special?
Unlike typical CoT models that only thinks one time, Dhanishtha thinks iteratively:

> Think → Answer → Rethink → Improve → Rethink again if needed.

🔗 Try it now: HelpingAI/Dhanishtha-2.0-preview-0825

🔞 Dhanishtha NSFW Preview

For those exploring more expressive and immersive roleplay scenarios, we’re also releasing:

HelpingAI/Dhanishtha-nsfw
A specialized version tuned for adult-themed interactions and character-driven roleplay.

🔗 Explore it here: HelpingAI/Dhanishtha-nsfw

💬 You can also try all of these live at chat.helpingai.co
·