That's wonderful!
Xuan-Son Nguyen
ngxson
AI & ML interests
Doing AI for fun, not for profit
Recent Activity
upvoted a changelog 1 day ago
Xet is now the default storage option for new users and organizations
upvoted a changelog 1 day ago
Static Spaces can now have a build step
updated a model 2 days ago
ggml-org/ultravox-v0_5-llama-3_1-8b-GGUF
ngxson's activity

reacted to loubnabnl's post with ❤️ 8 days ago
Post
2464
SmolVLM is now available on PocketPal: you can run it offline on your smartphone to interpret the world around you. 📱
And check out this real-time camera demo by @ngxson , powered by llama.cpp:
https://github.com/ngxson/smolvlm-realtime-webcam
https://x.com/pocketpal_ai
For around 80 euros I can buy a Raspberry Pi 4 kit, so I would expect a robot kit to cost about the same.

reacted to clem's post with ❤️🔥 3 months ago
Post
7343
I was chatting with @peakji, one of the cofounders of Manus AI, who told me he was on Hugging Face (very cool!).
He shared an interesting insight: agentic capabilities might be more of an alignment problem than a foundational capability issue. Similar to the difference between GPT-3 and InstructGPT, some open-source foundation models are simply trained to 'answer everything in one response regardless of the complexity of the question'; after all, that's the user preference in chatbot use cases. Just a bit of post-training on agentic trajectories can make an immediate and dramatic difference.
As a thank you to the community, he shared 100 invite codes, first come, first served. Just use "HUGGINGFACE" to get access!

posted an update 3 months ago
Post
3939
A comprehensive matrix for which format you should use.
Read more on my blog post: https://huggingface.co/blog/ngxson/common-ai-model-formats
| Hardware | GGUF | PyTorch | Safetensors | ONNX |
|-----------------|-----------|------------------------|--------------------------|-------|
| CPU | ✅ (best) | 🟡 | 🟡 | ✅ |
| GPU | ✅ | ✅ | ✅ | ✅ |
| Mobile | ✅ | 🟡 (via executorch) | ❌ | ✅ |
| Apple silicon | ✅ | 🟡 | ✅ (via MLX framework) | ✅ |
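For anyone who wants to query the matrix from code, here is a minimal Python sketch. The `SUPPORT` dictionary transcribes the table; the `Support` enum, the reading of the yellow marker as "partial", and the `formats_for` helper are illustrative assumptions, not something taken from the blog post.

```python
from enum import Enum


class Support(Enum):
    """Support level per table symbol (the names are assumptions)."""
    YES = "yes"          # check mark in the table
    PARTIAL = "partial"  # yellow circle, read here as "works with caveats"
    NO = "no"            # cross in the table


# Transcription of the matrix: hardware -> format -> (support level, note).
SUPPORT = {
    "CPU": {
        "GGUF": (Support.YES, "best"),
        "PyTorch": (Support.PARTIAL, ""),
        "Safetensors": (Support.PARTIAL, ""),
        "ONNX": (Support.YES, ""),
    },
    "GPU": {
        "GGUF": (Support.YES, ""),
        "PyTorch": (Support.YES, ""),
        "Safetensors": (Support.YES, ""),
        "ONNX": (Support.YES, ""),
    },
    "Mobile": {
        "GGUF": (Support.YES, ""),
        "PyTorch": (Support.PARTIAL, "via executorch"),
        "Safetensors": (Support.NO, ""),
        "ONNX": (Support.YES, ""),
    },
    "Apple silicon": {
        "GGUF": (Support.YES, ""),
        "PyTorch": (Support.PARTIAL, ""),
        "Safetensors": (Support.YES, "via MLX framework"),
        "ONNX": (Support.YES, ""),
    },
}


def formats_for(hardware, include_partial=False):
    """Return the formats the matrix marks as supported on the given hardware."""
    accepted = {Support.YES}
    if include_partial:
        accepted.add(Support.PARTIAL)
    return [
        fmt
        for fmt, (level, _note) in SUPPORT[hardware].items()
        if level in accepted
    ]


print(formats_for("Mobile"))                        # ['GGUF', 'ONNX']
print(formats_for("Mobile", include_partial=True))  # ['GGUF', 'PyTorch', 'ONNX']
```

Keeping the note next to the support level preserves qualifiers such as "via executorch", which matter as much as the symbol itself.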

reacted to fdaudens's post with 🔥❤️ 3 months ago
Post
3364
Just launched: A toolkit of 20 powerful AI tools that journalists can use right now - transcribe, analyze, create. 100% free & open-source.
Been testing all these tools myself and created a searchable collection of the most practical ones - from audio transcription to image generation to document analysis. No coding needed, no expensive subscriptions.
Some highlights I've tested personally:
- Private, on-device transcription with speaker ID in 100+ languages using Whisper
- Website scraping that just works - paste a URL, get structured data
- Local image editing with tools like Finegrain (impressive results)
- Document chat using Qwen 2.5 72B (handles technical papers well)
Sharing this early because the best tools come from the community. Drop your favorite tools in the comments or join the discussion on what to add next!
JournalistsonHF/ai-toolkit

reacted to as-cle-bert's post 3 months ago
Post
2411
I built an AI agent app in less than 8 hours 🤯
And, believe me, this is not clickbait ❌
GitHub: https://github.com/AstraBert/PapersChat
Demo: as-cle-bert/PapersChat
The app is called PapersChat, and it is aimed at making chatting with scientific papers easier.
Here is what the app does:
- Parses the papers that you upload thanks to LlamaIndex🦙 (either with LlamaParse or with simpler, local methods)
- Embeds documents both with a sparse and with a dense encoder to enable hybrid search
- Uploads the embeddings to Qdrant
- Activates an Agent based on mistralai/Mistral-Small-24B-Instruct-2501 that will reply to your prompt
- Retrieves information relevant to your question from the documents
- If no relevant information is found, it searches PubMed and arXiv databases
- Returns a grounded answer to your prompt
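The retrieval part of that flow can be sketched in a few lines of plain Python. This is not PapersChat's code: the two scoring functions are toy stand-ins for the sparse and dense encoders, the rankings are combined with reciprocal rank fusion (a common choice, assumed here rather than taken from the post), and `search_external` is a hypothetical placeholder for the PubMed/arXiv lookup. The Qdrant upload and the agent itself are left out.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def sparse_score(query, doc):
    """Toy sparse signal: how often the query terms occur in the document."""
    counts = Counter(tokenize(doc))
    return sum(counts[term] for term in set(tokenize(query)))


def dense_score(query, doc):
    """Toy dense signal: cosine similarity of character-trigram count vectors."""
    def trigrams(text):
        s = text.lower()
        return Counter(s[i:i + 3] for i in range(len(s) - 2))

    q, d = trigrams(query), trigrams(doc)
    dot = sum(q[g] * d[g] for g in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return dot / norm if norm else 0.0


def hybrid_search(query, docs, k=60, top_n=3):
    """Fuse both rankings with reciprocal rank fusion; skip docs with zero score."""
    fused = Counter()
    for scorer in (sparse_score, dense_score):
        scored = [(scorer(query, doc), i) for i, doc in enumerate(docs)]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
        for rank, (_score, i) in enumerate(ranked, start=1):
            fused[i] += 1.0 / (k + rank)
    return [docs[i] for i, _ in fused.most_common(top_n)]


def search_external(query):
    """Hypothetical placeholder for the PubMed/arXiv fallback."""
    return [f"(external result for: {query})"]


def retrieve_context(query, docs):
    """Try the uploaded documents first; fall back to external search if nothing matches."""
    hits = hybrid_search(query, docs)
    if hits:
        return "local", hits
    return "external", search_external(query)


docs = [
    "Qdrant stores dense and sparse vectors",
    "Gradio builds web interfaces",
]
print(retrieve_context("sparse vectors", docs))
print(retrieve_context("zzzz", docs))
```

The first query is answered from the local documents; the second matches nothing, so the sketch takes the external branch. In a real setup the "no relevant information" decision would more likely be a score threshold or a judgment made by the agent.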
How did I manage to make this application in 8 hours?
Three key points:
- LlamaIndex🦙 provides countless integrations with LLM providers, text embedding models and vectorstore services, and takes care of the internal architecture of the Agent. You just plug it in, and it works! ⚡
- Qdrant is a vector database service extremely easy to set up and use: you just need a one-line Docker command
- Gradio makes frontend development painless and fast, while still providing modern and responsive interfaces
And a bonus point:
- Deploying the demo app couldn't be easier if you use Gradio-based Hugging Face Spaces 🤗
So, no more excuses: build your own AI agent today and do it fast, (almost) for free and effortlessly.
And if you need a starting point, the code for PapersChat is open and fully reproducible on GitHub: https://github.com/AstraBert/PapersChat

reacted to burtenshaw's post 3 months ago
Post
3680
Hey, I'm Ben and I work at Hugging Face.
Right now, I'm focusing on educational stuff and getting loads of new people to build open AI models using free and open source tools.
I've made a collection of some of the tools I'm building and using for teaching. Stuff like quizzes, code challenges, and certificates.
https://huggingface.co/collections/burtenshaw/tools-for-learning-ai-6797453caae193052d3638e2