Paris AI Running Club

community

AI & ML interests

None defined yet.

Recent Activity

paris-ai-running-club's activity

merveย 
posted an update 2 days ago
view post
Post
2203
So many open releases at Hugging Face past week ๐Ÿคฏ recapping all here โคต๏ธ merve/march-21-releases-67dbe10e185f199e656140ae

๐Ÿ‘€ Multimodal
> Mistral AI released a 24B vision LM, both base and instruction FT versions, sota ๐Ÿ”ฅ (OS)
> with IBM we released SmolDocling, a sota 256M document parser with Apache 2.0 license (OS)
> SpatialLM is a new vision LM that outputs 3D bounding boxes, comes with 0.5B (QwenVL based) and 1B (Llama based) variants
> SkyWork released SkyWork-R1V-38B, new vision reasoning model (OS)

๐Ÿ’ฌ LLMs
> NVIDIA released new Nemotron models in 49B and 8B with their post-training dataset
> LG released EXAONE, new reasoning models in 2.4B, 7.8B and 32B
> Dataset: Glaive AI released a new reasoning dataset of 22M+ examples
> Dataset: NVIDIA released new helpfulness dataset HelpSteer3
> Dataset: OpenManusRL is a new agent dataset based on ReAct framework (OS)
> Open-R1 team released OlympicCoder, new competitive coder model in 7B and 32B
> Dataset: GeneralThought-430K is a new reasoning dataset (OS)

๐Ÿ–ผ๏ธ Image Generation/Computer Vision
> Roboflow released RF-DETR, new real-time sota object detector (OS) ๐Ÿ”ฅ
> YOLOE is a new real-time zero-shot object detector with text and visual prompts ๐Ÿฅน
> Stability AI released Stable Virtual Camera, a new novel view synthesis model
> Tencent released Hunyuan3D-2mini, new small and fast 3D asset generation model
> ByteDance released InfiniteYou, new realistic photo generation model
> StarVector is a new 8B model that generates svg from images
> FlexWorld is a new model that expands 3D views (OS)

๐ŸŽค Audio
> Sesame released CSM-1B new speech generation model (OS)

๐Ÿค– Robotics
> NVIDIA released GR00T, new robotics model for generalized reasoning and skills, along with the dataset

*OS ones have Apache 2.0 or MIT license
clemย 
posted an update 4 days ago
view post
Post
3429
Should we assemble affordable open-source robots at Hugging Face for the community. Would you buy them? At what price?
ยท
clemย 
posted an update 5 days ago
view post
Post
2354
Nice new space to see how fast your personal or organization followers are growing on HF:
julien-c/follow-history

As you can see, I still have more followers than @julien-c even if he's trying to change this by building such cool spaces ๐Ÿ˜๐Ÿ˜๐Ÿ˜
clemย 
posted an update 11 days ago
view post
Post
4548
We just crossed 1,500,000 public models on Hugging Face (and 500k spaces, 330k datasets, 50k papers). One new repository is created every 15 seconds. Congratulations all!
ยท
not-lainย 
posted an update 12 days ago
julien-cย 
posted an update 13 days ago
view post
Post
2505
Important notice ๐Ÿšจ

For Inference Providers who have built support for our Billing API (currently: Fal, Novita, HF-Inference โ€“ with more coming soon), we've started enabling Pay as you go (=PAYG)

What this means is that you can use those Inference Providers beyond the free included credits, and they're charged to your HF account.

You can see it on this view: any provider that does not have a "Billing disabled" badge, is PAYG-compatible.
clemย 
posted an update 16 days ago
view post
Post
7225
I was chatting with @peakji , one of the cofounders of Manu AI, who told me he was on Hugging Face (very cool!).

He shared an interesting insight which is that agentic capabilities might be more of an alignment problem rather than a foundational capability issue. Similar to the difference between GPT-3 and InstructGPT, some open-source foundation models are simply trained to 'answer everything in one response regardless of the complexity of the question' - after all, that's the user preference in chatbot use cases. Just a bit of post-training on agentic trajectories can make an immediate and dramatic difference.

As a thank you to the community, he shared 100 invite code first-come first serve, just use โ€œHUGGINGFACEโ€ to get access!
ยท
clemย 
posted an update 17 days ago
albertvillanovaย 
posted an update 17 days ago
view post
Post
3654
๐Ÿš€ New smolagents update: Safer Local Python Execution! ๐Ÿฆพ๐Ÿ

With the latest release, we've added security checks to the local Python interpreter: every evaluation is now analyzed for dangerous builtins, modules, and functions. ๐Ÿ”’

Here's why this matters & what you need to know! ๐Ÿงต๐Ÿ‘‡

1๏ธโƒฃ Why is local execution risky? โš ๏ธ
AI agents that run arbitrary Python code can unintentionally (or maliciously) access system files, run unsafe commands, or exfiltrate data.

2๏ธโƒฃ New Safety Layer in smolagents ๐Ÿ›ก๏ธ
We now inspect every return value during execution:
โœ… Allowed: Safe built-in types (e.g., numbers, strings, lists)
โ›” Blocked: Dangerous functions/modules (e.g., os.system, subprocess, exec, shutil)

3๏ธโƒฃ Immediate Benefits ๐Ÿ’ก
- Prevent agents from accessing unsafe builtins
- Block unauthorized file or network access
- Reduce accidental security vulnerabilities

4๏ธโƒฃ Security Disclaimer โš ๏ธ
๐Ÿšจ Despite these improvements, local Python execution is NEVER 100% safe. ๐Ÿšจ
If you need true isolation, use a remote sandboxed executor like Docker or E2B.

5๏ธโƒฃ The Best Practice: Use Sandboxed Execution ๐Ÿ”
For production-grade AI agents, we strongly recommend running code in a Docker or E2B sandbox to ensure complete isolation.

6๏ธโƒฃ Upgrade Now & Stay Safe! ๐Ÿš€
Check out the latest smolagents release and start building safer AI agents today.

๐Ÿ”— https://github.com/huggingface/smolagents

What security measures do you take when running AI-generated code? Letโ€™s discuss! ๐Ÿ‘‡

#AI #smolagents #Python #Security
  • 2 replies
ยท
albertvillanovaย 
posted an update 18 days ago
view post
Post
3836
๐Ÿš€ Big news for AI agents! With the latest release of smolagents, you can now securely execute Python code in sandboxed Docker or E2B environments. ๐Ÿฆพ๐Ÿ”’

Here's why this is a game-changer for agent-based systems: ๐Ÿงต๐Ÿ‘‡

1๏ธโƒฃ Security First ๐Ÿ”
Running AI agents in unrestricted Python environments is risky! With sandboxing, your agents are isolated, preventing unintended file access, network abuse, or system modifications.

2๏ธโƒฃ Deterministic & Reproducible Runs ๐Ÿ“ฆ
By running agents in containerized environments, you ensure that every execution happens in a controlled and predictable settingโ€”no more environment mismatches or dependency issues!

3๏ธโƒฃ Resource Control & Limits ๐Ÿšฆ
Docker and E2B allow you to enforce CPU, memory, and execution time limits, so rogue or inefficient agents donโ€™t spiral out of control.

4๏ธโƒฃ Safer Code Execution in Production ๐Ÿญ
Deploy AI agents confidently, knowing that any generated code runs in an ephemeral, isolated environment, protecting your host machine and infrastructure.

5๏ธโƒฃ Easy to Integrate ๐Ÿ› ๏ธ
With smolagents, you can simply configure your agent to use Docker or E2B as its execution backendโ€”no need for complex security setups!

6๏ธโƒฃ Perfect for Autonomous AI Agents ๐Ÿค–
If your AI agents generate and execute code dynamically, this is a must-have to avoid security pitfalls while enabling advanced automation.

โšก Get started now: https://github.com/huggingface/smolagents

What will you build with smolagents? Let us know! ๐Ÿš€๐Ÿ’ก
clemย 
posted an update 20 days ago
view post
Post
5899
Super happy to welcome Nvidia as our latest enterprise hub customer. They have almost 2,000 team members using Hugging Face, and close to 20,000 followers of their org. Can't wait to see what they'll open-source for all of us in the coming months!

Nvidia's org: https://huggingface.co/nvidia
Enterprise hub: https://huggingface.co/enterprise
zamalย 
posted an update 22 days ago
view post
Post
1950
๐Ÿš€ ftBoost is LIVE โ€“ Stop Struggling with Fine-Tuning Data!

Alright folks, if youโ€™re tired of manually crafting fine-tuning datasets, ftBoost is here to do the heavy lifting. One-click, LangChain-Groq-powered data augmentation that scales your training data in OpenAI, Gemini, Mistral, and LLaMA formatsโ€”automatically.

๐Ÿ”ฅ Whatโ€™s inside?
โœ… Smart Augmentations โ€“ Paraphrasing, back translation, synonym swapping & synthetic noise.
โœ… No more JSONL headaches โ€“ Auto-formats everything for OpenAI, Gemini, Mistral & LLaMA.
โœ… Custom tuning โ€“ Adjust similarity, diversity, and fluency in real-time.
โœ… Upload, generate, download โ€“ Thatโ€™s it.

โšก If youโ€™re fine-tuning LLMs, this will save you hours.

๐Ÿš€ Try it now: ๐Ÿ‘‰ zamal/Finetune-Boost

๐ŸŒŸ Give us a star on GitHub!

Let me know what you think & how it boosts your workflow! ๐Ÿ”ฅ
merveย 
posted an update about 1 month ago
view post
Post
6414
Google just released PaliGemma 2 Mix: new versatile instruction vision language models ๐Ÿ”ฅ

> Three new models: 3B, 10B, 28B with res 224, 448 ๐Ÿ’™
> Can do vision language tasks with open-ended prompts, understand documents, and segment or detect anything ๐Ÿคฏ

Read more https://huggingface.co/blog/paligemma2mix
Try the demo google/paligemma2-10b-mix
All models are here google/paligemma-2-mix-67ac6a251aaf3ee73679dcc4
clemย 
posted an update about 1 month ago
view post
Post
2829
What are the best organizations to follow on @huggingface ?

On top of my head:
- Deepseek (35,000 followers): https://huggingface.co/deepseek-ai
- Meta Llama (27,000 followers): https://huggingface.co/meta-llama
- Black Forrest Labs (11,000 followers): https://huggingface.co/black-forest-labs
- OpenAI (5,000 followers): https://huggingface.co/openai
- Nvidia (16,000 followers): https://huggingface.co/nvidia
- MIcrosoft (9,000 followers): https://huggingface.co/microsoft
- AllenAI (2,000 followers): https://huggingface.co/allenai
- Mistral (5,000 followers): https://huggingface.co/mistralai
- XAI (600 followers): https://huggingface.co/xai-org
- Stability AI (16,000 followers): https://huggingface.co/stabilityai
- Qwen (16,000 followers): https://huggingface.co/Qwen
- GoogleAI (8,000 followers): https://huggingface.co/google
- Unsloth (3,000 followers): https://huggingface.co/unsloth
- Bria AI (4,000 followers): https://huggingface.co/briaai
- NousResearch (1,300 followers): https://huggingface.co/NousResearch

Bonus, the agent course org with 17,000 followers: https://huggingface.co/agents-course
  • 1 reply
ยท
clemย 
posted an update about 1 month ago
view post
Post
3488
We crossed 1B+ tokens routed to inference providers partners on HF, that we released just a few days ago.

Just getting started of course but early users seem to like it & always happy to be able to partner with cool startups in the ecosystem.

Have you been using any integration and how can we make it better?

https://huggingface.co/blog/inference-providers
merveย 
posted an update about 1 month ago
view post
Post
4756
Your weekly recap of open AI is here, and it's packed with models! merve/feb-14-releases-67af876b404cc27c6d837767

๐Ÿ‘€ Multimodal
> OpenGVLab released InternVideo 2.5 Chat models, new video LMs with long context
> AIDC released Ovis2 model family along with Ovis dataset, new vision LMs in different sizes (1B, 2B, 4B, 8B, 16B, 34B), with video and OCR support
> ColQwenStella-2b is a multilingual visual retrieval model that is sota in it's size
> Hoags-2B-Exp is a new multilingual vision LM with contextual reasoning, long context video understanding

๐Ÿ’ฌ LLMs
A lot of math models!
> Open-R1 team released OpenR1-Math-220k large scale math reasoning dataset, along with Qwen2.5-220K-Math fine-tuned on the dataset, OpenR1-Qwen-7B
> Nomic AI released new Nomic Embed multilingual retrieval model, a MoE with 500 params with 305M active params, outperforming other models
> DeepScaleR-1.5B-Preview is a new DeepSeek-R1-Distill fine-tune using distributed RL on math
> LIMO is a new fine-tune of Qwen2.5-32B-Instruct on Math

๐Ÿ—ฃ๏ธ Audio
> Zonos-v0.1 is a new family of speech recognition models, which contains the model itself and embeddings

๐Ÿ–ผ๏ธ Vision and Image Generation
> We have ported DepthPro of Apple to transformers for your convenience!
> illustrious-xl-v1.0 is a new illustration generation model
ยท
merveย 
posted an update about 2 months ago
view post
Post
3138
Interesting releases in open AI this week, let's recap ๐Ÿค  merve/feb-7-releases-67a5f7d7f172d8bfe0dd66f4

๐Ÿค– Robotics
> Pi0, first open-source foundation vision-language action model was released in Le Robot (Apache 2.0)

๐Ÿ’ฌ LLMs
> Groundbreaking: s1 is simpler approach to test-time scaling, the release comes with small s1K dataset of 1k question-reasoning trace pairs (from Gemini-Thinking Exp) they fine-tune Qwen2.5-32B-Instruct to get s1-32B, outperforming o1-preview on math ๐Ÿคฏ s1-32B and s1K is out!
> Adyen released DABstep, a new benchmark along with it's leaderboard demo for agents doing data analysis
> Krutrim released Krutrim-2 instruct, new 12B model based on NeMo12B trained and aligned on Indic languages, a new multilingual sentence embedding model (based on STSB-XLM-R), and a translation model for Indic languages

๐Ÿ‘€ Multimodal
> PKU released Align-DS-V, a model aligned using their new technique called LLF for all modalities (image-text-audio), along with the dataset Align Anything
> OLA-7B is a new any-to-any model by Tencent that can take text, image, video, audio data with context window of 32k tokens and output text and speech in English and Chinese
> Krutrim released Chitrarth, a new vision language model for Indic languages and English

๐Ÿ–ผ๏ธ Vision
> BiRefNet_HR is a new higher resolution BiRefNet for background removal

๐Ÿ—ฃ๏ธ Audio
> kyutai released Hibiki, it's a real-time speech-to-speech translation model ๐Ÿคฏ it's available for French-English translation
> Krutrim released Dhwani, a new STT model for Indic languages
> They also release a new dataset for STT-TTS

๐Ÿ–ผ๏ธ Image Generation
> Lumina released Lumina-Image-2.0, a 2B parameter-flow based DiT for text to image generation
> Tencent released Hunyuan3D-2, a 3D asset generation model based on DiT and Hunyuan3D-Paint
> boreal-hl-v1 is a new boring photorealistic image generation LoRA based on Hunyuan
merveย 
posted an update about 2 months ago
albertvillanovaย 
posted an update about 2 months ago
view post
Post
3847
๐Ÿš€ Introducing @huggingface Open Deep-Research๐Ÿ’ฅ

In just 24 hours, we built an open-source agent that:
โœ… Autonomously browse the web
โœ… Search, scroll & extract info
โœ… Download & manipulate files
โœ… Run calculations on data

55% on GAIA validation set! Help us improve it!๐Ÿ’ก
https://huggingface.co/blog/open-deep-research
  • 3 replies
ยท