AI & ML interests

None defined yet.

Nymbo 
posted an update 1 day ago
view post
Post
225
I have a few updates to my MCP server I wanna share: New Memory tool, improvements to web search & speech generation.

# Memory_Manager Tool

We now have a Memory_Manager tool. Ask ChatGPT to write all its memories verbatim, then tell gpt-oss-20b to save each one using the tool, then take them anywhere! It stores memories in a memories.json file in the repo, no external database required.

The Memory_Manager tool is currently hidden from the HF space because it's intended for local use. It's enabled by providing a HF_READ_TOKEN in the env secrets, although it doesn't actually use the key for anything. There's probably a cleaner way of ensuring memory is only used locally, I'll come back to this.

# Fetch & Websearch

The Fetch_Webpage tool has been simplified a lot. It now converts the page to Markdown and returns the page with three length settings (Brief, Standard, Full). This is a lot more reliable than the old custom extraction method.

The Search_DuckDuckGo tool has a few small improvements. The input is easier for small models to get right, and the output is more readable.

# Speech Generation

I've added the remaining voices for Kokoro-82M, it now supports all 54 voices with all accents/languages.

I also removed the 30 second cap by making sure it computes all chunks in sequence, not just the first. I've tested it on outputs that are ~10 minutes long. Do note that when used as an MCP server, the tool will timeout after 1 minute, nothing I can do about that for right now.

# Other Thoughts

Lots of MCP use cases involve manipulating media (image editing, ASR, etc.). I've avoided adding tools like this so far for two reasons:

1. Most of these solutions would require assigning it a ZeroGPU slot.
2. The current process of uploading files like images to a Gradio space is still a bit rough. It's doable but requires additional tools.

Both of these points make it a bit painful for local usage. I'm open to suggestions for other tools that rely on text.
Tonic 
posted an update 3 days ago
view post
Post
289
🙋🏻‍♂️ Hey there folks ,

Just wanted to annouce 🏭SmolFactory : it's the quickest and best way to finetune SmolLM3 and GPT-OSS-20B on huggingface !

Basicaly it's an app you can run on huggingface by duplicating the space and running your training directly on huggingface GPUs .

It will help you basically select datasets and models, fine tune your model , make an experiment tracker you can use on your mobile phone , push all your model card and even automatically make a demo for you on huggingface so you can directly test it out when it's done !

check out the blog to learn more : https://huggingface.co/blog/Tonic/smolfactory

or just try the app directly :
Tonic/SmolFactory

you can vibe check the cool models I made :
French SmolLM3 : Tonic/Petite-LLM-3
Medical GPT-OSS : Tonic/med-gpt-oss-20b-demo

check out the model cards :
multilingual reasoner (gpt-oss) - Tonic/gpt-oss-20b-multilingual-reasoner
med-gpt-oss : Tonic/med-gpt-oss-20b
petite-elle-l-aime : Tonic/petite-elle-L-aime-3-sft

github repo if you like command line more than gradio : https://github.com/josephrp/smolfactory

drop some likes on these links it's really much appreciated !

feedback and PRs are welcome !
hannayukhymenko 
posted an update 5 days ago
view post
Post
2522
Releasing the Jupyter Agent Dataset! 🚀

Built from 7 TB of real Kaggle datasets + 20k notebooks, creating real code exec traces using Qwen3-Coder and E2B.
Training on this data dramatically improves the ability to execute code and analyze data.

We ( @baptistecolle @hannayukhymenko @lvwerra ) have created a novel synthetic data generation pipeline with efficient scaffolding, which gives a big performance boost after training your coding agent🔥With the help of real Kaggle notebooks and datasets we generate synthetic notebooks which aim to analyze datasets and answer factual questions about them more efficiently. We simulate a real code execution environment by prompting LLMs or with the help of E2B sandboxes. We have built a dataset of 50k+ high-quality LLM-generated notebooks which can help your agent become better at performing data analysis and question answering.

Link: data-agents/jupyter-agent-dataset
  • 3 replies
·
Nymbo 
posted an update 14 days ago
view post
Post
741
I built a general use MCP space ~ Fetch webpages, DuckDuckGo search, Python code execution, Kokoro TTS, Image Gen, Video Gen.

# Tools

1. Fetch webpage
2. Web search via DuckDuckGo (very concise, low excess context)
3. Python code executor
4. Kokoro-82M speech generation
5. Image Generation (use any model from HF Inference Providers)
6. Video Generation (use any model from HF Inference Providers)

The first four tools can be used without any API keys whatsoever. DDG search is free and the code execution and speech gen is done on CPU. Having a HF_READ_TOKEN in the env variables will show all tools. If there isn't a key present, The Image/Video Gen tools are hidden.

Nymbo/Tools
Nymbo 
posted an update 22 days ago
view post
Post
964
Anyone using Jan-v1-4B for local MCP-based web search, I highly recommend you try out Intelligent-Internet/II-Search-4B

Very impressed with this lil guy and it deserves more downloads. It's based on the original version of Qwen3-4B but find that it questions reality way less often. Jan-v1 seems to think that everything it sees is synthetic data and constantly gaslights me
Tonic 
posted an update about 1 month ago
Abhaykoul 
posted an update about 1 month ago
view post
Post
3967
🚀 Dhanishtha-2.0-preview-0825 Is Here

The Intermediate Thinking Model just leveled up again.

With sharper reasoning, better tool use, and expanded capabilities, Dhanishtha-2.0-preview-0825 is now live and ready to impress.

🧠 What Makes Dhanishtha Special?
Unlike typical CoT models that only thinks one time, Dhanishtha thinks iteratively:

> Think → Answer → Rethink → Improve → Rethink again if needed.

🔗 Try it now: HelpingAI/Dhanishtha-2.0-preview-0825

🔞 Dhanishtha NSFW Preview

For those exploring more expressive and immersive roleplay scenarios, we’re also releasing:

HelpingAI/Dhanishtha-nsfw
A specialized version tuned for adult-themed interactions and character-driven roleplay.

🔗 Explore it here: HelpingAI/Dhanishtha-nsfw

💬 You can also try all of these live at chat.helpingai.co
·
Tonic 
posted an update about 2 months ago
view post
Post
758
👋 Hey there folks,

just submitted my plugin idea to the G-Assist Plugin Hackathon by @nvidia . Check it out, it's a great way to use a local SLA model on a windows machine to easily and locally get things done ! https://github.com/NVIDIA/G-Assist
Tonic 
posted an update about 2 months ago
view post
Post
586
🙋🏻‍♂️ Hey there folks ,

Yesterday , Nvidia released a reasoning model that beats o3 on science, math and coding !

Today you can try it out here : Tonic/Nvidia-OpenReasoning

hope you like it !
Abhaykoul 
posted an update about 2 months ago
view post
Post
3081
🎉 Dhanishtha-2.0-preview-0725 is Now Live

The Intermediate Thinking Model just got even better.
With the new update, Dhanishtha is now sharper, smarter, and trained further on tool use

🧠 What Makes Dhanishtha Different?
Unlike standard COT models that give one-shot responses, Dhanishtha thinks in layers:

> Think → Answer → Rethink → Improve → Rethink again if needed.

HelpingAI/Dhanishtha-2.0-preview-0725
Tonic 
posted an update about 2 months ago
view post
Post
3337
🙋🏻‍♂️ Normalize adding compute & runtime traces to your model cards
  • 2 replies
·
Tonic 
posted an update 2 months ago
view post
Post
512
Who's going to Raise Summit in Paris Tomorrow ?

If you're around , I would love to meet you :-)
Abhaykoul 
posted an update 2 months ago
view post
Post
3041
🎉 Dhanishtha 2.0 Preview is Now Open Source!

The world's first Intermediate Thinking Model is now available to everyone!

Dhanishtha 2.0 Preview brings revolutionary intermediate thinking capabilities to the open-source community. Unlike traditional reasoning models that think once, Dhanishtha can think, answer, rethink, answer again, and continue rethinking as needed using multiple blocks between responses.

🚀 Key Features
- Intermediate thinking: Think → Answer → Rethink → Answer → Rethink if needed...
- Token efficient: Uses up to 79% fewer tokens than DeepSeek R1 on similar queries
- Transparent thinking: See the model's reasoning process in real-time
- Open source: Freely available for research and development


HelpingAI/Dhanishtha-2.0-preview
https://helpingai.co/chat
  • 1 reply
·
Nymbo 
posted an update 2 months ago
view post
Post
2831
Anyone know how to reset Claude web's MCP config? I connected mine when the HF MCP first released with just the default example spaces added. I added lots of other MCP spaces but Claude.ai doesn't update the available tools... "Disconnecting" the HF integration does nothing, deleting it and adding it again does nothing.

Refreshing tools works fine in VS Code because I can manually restart it in mcp.json, but claude.ai has no such option. Anyone got any ideas?
·
Abhaykoul 
posted an update 2 months ago
view post
Post
4520
Introducing Dhanishtha 2.0: World's first Intermediate Thinking Model

Dhanishtha 2.0 is the world's first LLM designed to think between the responses. Unlike other Reasoning LLMs, which think just once.

Dhanishtha can think, rethink, self-evaluate, and refine in between responses using multiple <think> blocks.
This technique makes it Hinghlt Token efficient it Uses up to 79% fewer tokens than DeepSeek R1
---

You can try our model from: https://helpingai.co/chat
Also, we're gonna Open-Source Dhanistha on July 1st.

---
For Devs:
🔑 Get your API key at https://helpingai.co/dashboard
from HelpingAI import HAI  # pip install HelpingAI==1.1.1
from rich import print

hai = HAI(api_key="hl-***********************")

response = hai.chat.completions.create(
    model="Dhanishtha-2.0-preview",
    messages=[{"role": "user", "content": "What is the value of ∫0∞𝑥3/𝑥−1𝑑𝑥 ?"}],
    stream=True,
    hide_think=False # Hide or show models thinking
)

for chunk in response:
    print(chunk.choices[0].delta.content, end="", flush=True)
  • 2 replies
·
Tonic 
posted an update 3 months ago
view post
Post
691
🙋🏻‍♂️ hey there folks ,

So every bio/med/chem meeting i go to i always the same questions "why are you sharing a gdrive link with me for this?" and "Do you have any plans to publish your model weights and datasets on huggingface?" and finally i got a good answer today which explains everything :

basically there is some kind of government censorship on this (usa, but i'm sure others too) and they are told they are not allowed as it is considered a "dataleak" which is illegal !!!!

this is terrible ! but the good news is that we can do something about it !

so there is this "call for opinions and comments" here from the NIH (usa) , and here we can make our opinion on this topic known : https://osp.od.nih.gov/comment-form-responsibly-developing-and-sharing-generative-artificial-intelligence-tools-using-nih-controlled-access-data/

kindly consider dropping your opinion and thoughts about this censorship of science , and share this post , link or thoughts widely .

Together maybe we can start to share data and model weights appropriately and openly in a good way 🙏🏻🚀

cc. @cyrilzakka

Tonic 
posted an update 3 months ago
view post
Post
2546
🙋🏻‍♂️ Hey there folks ,

Yesterday the world's first "Learn to Vibe Code" application was released .

As vibe coding is the mainstream paradigm , so now the first educational app is there to support it .

You can try it out already :

https://vibe.takara.ai

and of course it's entirely open source, so i already made my issue and feature branch :-) 🚀
Nymbo 
posted an update 4 months ago
view post
Post
4097
Haven't seen this posted anywhere - Llama-3.3-8B-Instruct is available on the new Llama API. Is this a new model or did someone mislabel Llama-3.1-8B?
  • 1 reply
·