Learn how to search a video dataset and generate answers using Tevatron/OmniEmbed-v0.1-multivent, an all-modality retriever, and Qwen/Qwen2.5-Omni-7B, an any-to-any model, in this notebook 🤝 merve/smol-vision
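A minimal sketch of the retrieve-then-generate pattern the notebook walks through, assuming the query and video clips have already been embedded (in practice with Tevatron/OmniEmbed-v0.1-multivent); the function names here are placeholders, not the notebook's code:

```python
import numpy as np

def cosine_top_k(query_vec: np.ndarray, video_vecs: np.ndarray, k: int = 3):
    """Rank precomputed video embeddings against a query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    v = video_vecs / np.linalg.norm(video_vecs, axis=1, keepdims=True)
    scores = v @ q                       # cosine similarity of each video to the query
    top = np.argsort(-scores)[:k]
    return top, scores[top]

# The top-k retrieved clips would then be passed to Qwen/Qwen2.5-Omni-7B to generate an answer.
```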

So… who are they, and why does it matter?
Had a lot of fun co-writing this blog post with @xianbao, with key insights translated from Chinese, to unpack how this startup built a model that outperforms GPT-4.1, Claude Opus, and DeepSeek V3 on several major benchmarks.
🧵 A few standout facts:
1. From zero to $3.3B in 18 months:
Founded in March 2023, Moonshot is now backed by Alibaba, Tencent, Meituan, and HongShan.
2. A CEO who thinks from the end:
Yang Zhilin (31) previously worked at Meta AI, Google Brain, and Carnegie Mellon. His vision? Nothing less than AGI — still a rare ambition among Chinese AI labs.
3. A trillion-parameter model that’s surprisingly efficient:
Kimi K2 uses a mixture-of-experts architecture (32B active params per inference) and dominates on coding/math benchmarks.
4. The secret weapon: Muon optimizer:
A new training method that doubles efficiency, cuts memory use in half, and trained on 15.5T tokens with zero failures. Big implications (see the sketch below for the core idea).
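For the curious, here is a minimal sketch of Muon's central trick: orthogonalizing each 2D weight update with a Newton-Schulz iteration. The coefficients and step count follow the public reference implementation and are assumptions, not Moonshot's production training code:

```python
import torch

def newton_schulz_orthogonalize(g: torch.Tensor, steps: int = 5, eps: float = 1e-7) -> torch.Tensor:
    """Approximately orthogonalize a 2D update matrix (the core operation in Muon)."""
    a, b, c = 3.4445, -4.7750, 2.0315    # quintic coefficients from the public reference implementation
    x = g / (g.norm() + eps)             # normalize so the iteration converges
    for _ in range(steps):
        s = x @ x.T
        x = a * x + (b * s + c * s @ s) @ x
    return x
```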
Most importantly, their move from closed to open source signals a broader shift in China’s AI scene — following Baidu’s pivot. But as Yang puts it: “Users are the only real leaderboard.”
👇 Check out the full post to explore what Kimi K2 can do, how to try it, and why it matters for the future of open-source LLMs:
https://huggingface.co/blog/fdaudens/moonshot-ai-kimi-k2-explained
No, the Pangu Model License Agreement Version 1.0 is not a free software license. It imposes significant restrictions, such as prohibiting use within the European Union (Section 3) and requiring attribution (Section 4.2), which conflict with the principles of free software licenses such as the GNU GPL and with the Open Source Definition. The non-transferable clause (Section 2) and indemnity requirement (Section 7) further deviate from standard free software terms.
🔥 "Open Model"? More Like "Openly Restrictive"! 🔥
Huawei calls Pangu Pro MoE an "open model"? That’s like calling a locked door an "open invitation." Let’s break down the brilliant "openness" here:
- "No EU Allowed!" (Section 3) – Because nothing says "open" like banning entire continents. GDPR too scary for you, Huawei?
- "Powered by Pangu" or GTFO (Section 4.2) – Mandatory branding? Real open-source models don’t force you to be a walking billboard.
- Non-transferable license (Section 2) – Can’t pass it on? So much for community sharing.
- Indemnify Huawei for your use (Section 7) – If anything goes wrong, you pay, not them. How generous!
This isn’t an "open model"—it’s a marketing stunt wrapped in proprietary chains. True open-source (Apache, MIT, GPL) doesn’t come with geographic bans, forced attribution, and legal traps.
Huawei, either commit to real openness or stop insulting the FOSS community with this pretend-free nonsense. 🚮
"not commercial" license isn't "Open Source", so please be accurate to users.
Reference:
The Open Source Definition – Open Source Initiative:
https://opensource.org/osd
Gemma License (danger) is not Free Software and is not Open Source:
https://gnu.support/gnu-emacs/emacs-lisp/Gemma-License-danger-is-not-Free-Software-and-is-not-Open-Source.html
So Google's goal is simply monopoly and user dependence. I suggest using fully free LLMs, free as in freedom.

Model:
THU-KEG/LongWriter-Zero-32B
Dataset:
THU-KEG/LongWriter-Zero-RLData
Paper:
LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning (2506.18841)
✨ 32B
✨ Multi-reward GRPO: length, fluency, structure, non-redundancy (see the sketch after this list)
✨ Enforces <think><answer> format via Format RM
✨ Built on Qwen2.5-32B-base
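A minimal sketch of how several reward signals can be combined and turned into group-relative advantages in GRPO; the reward names, weights, and format penalty below are placeholders, not the paper's actual reward models:

```python
import numpy as np

def combined_reward(length_r, fluency_r, structure_r, non_redundancy_r, format_ok,
                    weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of reward components; the format check gates the whole reward (an assumption)."""
    base = sum(w * r for w, r in zip(weights, (length_r, fluency_r, structure_r, non_redundancy_r)))
    return base if format_ok else 0.0    # e.g. zero reward when the <think><answer> format is violated

def group_relative_advantages(rewards):
    """GRPO's core step: normalize rewards within a group of completions for the same prompt."""
    r = np.asarray(rewards, dtype=np.float64)
    return (r - r.mean()) / (r.std() + 1e-8)

advantages = group_relative_advantages([
    combined_reward(0.8, 0.9, 0.7, 0.6, True),
    combined_reward(0.5, 0.7, 0.9, 0.8, True),
    combined_reward(0.9, 0.4, 0.6, 0.7, False),
])
```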
Okay, please keep researching so that you can get more tools for Uganda, Kenya and Tanzania.

Every language carries its own cultural values and worldviews. So, when we build AI systems, we're not just deciding how they speak but also whose perspectives they represent.
Even choosing which dialect to train on in Norway becomes a question of inclusion and power. In Kenya, will AI speak Swahili from Nairobi or coastal regions? What about indigenous languages with rich oral traditions but limited written text, like Quechua in Peru or Cherokee in North America?
The path forward? Building WITH communities, not just FOR them. Working with local partners (libraries, universities, civil society), testing for cultural alignment, and asking hard questions about representation.
Just published some thoughts on this after my keynote in Norway a few weeks ago: https://huggingface.co/blog/giadap/when-ai-speaks

Thank you, that is interesting, but where is the link? Is it going to work on 24 GB VRAM?

Ever felt your AI agent is "shooting from the hip"? It latches onto a single line of thought and fails to produce a robust, well-rounded plan. This is a common struggle I've called the "AI Reasoning Paradox."
To tackle this, I developed Trinity-Synthesis, a multi-agent architecture designed to force reflection and synthesis before delivering a final answer. The philosophy is simple: constructive conflict between different perspectives leads to better solutions.
Here’s the core idea:
Instead of one agent, it uses four agents running on the same base model but with different "personalities" defined by their system prompts and temperature settings:
🧠 The Visionary: Thinks outside the box (high temp: 1.0).
📊 The Analyst: Focuses on logic, data, and structure (low temp: 0.3).
🛠️ The Pragmatist: Evaluates feasibility, costs, and risks (mid temp: 0.5).
These three "thinkers" work in parallel on the same problem. Then, a final Synthesizer agent critically analyzes their outputs, rejects flawed arguments, and integrates the best points into a single, coherent, and often superior strategy.
The result is a more robust reasoning process that balances creativity with analytical rigor, making it ideal for solving complex, strategic problems where answer quality is critical.
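As an illustration only (not the source code from the article), here is a minimal sketch of the pattern, assuming an OpenAI-compatible endpoint; the model name, prompts, and function names are placeholders:

```python
from openai import OpenAI

client = OpenAI()  # assumes an OpenAI-compatible endpoint and API key in the environment

PERSONAS = {
    "Visionary":  ("You think outside the box and propose bold, unconventional ideas.", 1.0),
    "Analyst":    ("You focus strictly on logic, data, and structure.", 0.3),
    "Pragmatist": ("You evaluate feasibility, costs, and risks.", 0.5),
}

def ask(system_prompt: str, problem: str, temperature: float, model: str = "gpt-4o-mini") -> str:
    resp = client.chat.completions.create(
        model=model,
        temperature=temperature,
        messages=[{"role": "system", "content": system_prompt},
                  {"role": "user", "content": problem}],
    )
    return resp.choices[0].message.content

def trinity_synthesis(problem: str) -> str:
    # The three "thinkers" answer the same problem (run sequentially here for simplicity)
    drafts = {name: ask(sys_prompt, problem, temp) for name, (sys_prompt, temp) in PERSONAS.items()}
    synthesis_request = (
        "Critically compare the three answers below, reject flawed arguments, and merge the "
        "strongest points into one coherent plan.\n\n"
        + "\n\n".join(f"## {name}\n{draft}" for name, draft in drafts.items())
    )
    return ask("You are a rigorous synthesizer.", synthesis_request, 0.4)
```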
I've written a deep dive on how it works, including a detailed case study ("The Helios Initiative") and the Python source code for you to experiment with.
Read the full article on Medium:
https://medium.com/@brainhome9/trinity-synthesis-how-i-built-an-ai-agent-that-thinks-before-it-speaks-d45d45c2827c
I'd love to hear your feedback and see what you build with it!
#AI #AIAgents #LLM #Reasoning #MultiAgent

OpenEvolve is an evolutionary coding agent that uses LLMs to discover and optimize algorithms. I successfully replicated DeepMind's results on circle packing (99.97% match!) and evolved a random search into a simulated annealing algorithm.
✨ Key features:
- Evolves entire codebases (not just single functions)
- Works with any OpenAI-compatible API
- LLM ensemble approach for better results
- Multi-objective optimization
👉 Check it out:
GitHub: https://github.com/codelion/openevolve
Blog post: https://huggingface.co/blog/codelion/openevolve
Would love to hear your thoughts or answer any questions about it!
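Since the headline result was evolving random search into simulated annealing, here is a minimal, generic sketch of that target algorithm (not code from the OpenEvolve repository); the toy energy function and parameter values are placeholders:

```python
import math
import random

def simulated_annealing(initial, neighbor, energy, t0=1.0, t_min=1e-3, alpha=0.995):
    """Generic simulated annealing loop."""
    current, current_e = initial, energy(initial)
    best, best_e = current, current_e
    t = t0
    while t > t_min:
        candidate = neighbor(current)
        candidate_e = energy(candidate)
        # Always accept improvements; accept worse moves with a temperature-dependent probability
        if candidate_e < current_e or random.random() < math.exp((current_e - candidate_e) / t):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
        t *= alpha                        # cool down
    return best, best_e

# Toy usage: minimize (x - 3)^2 starting from 0
best_x, best_e = simulated_annealing(
    initial=0.0,
    neighbor=lambda x: x + random.uniform(-0.5, 0.5),
    energy=lambda x: (x - 3.0) ** 2,
)
```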
Gemini's proprietary license is a deal-breaker. It's not just about performance—it's about freedom. Google's terms actively restrict libre use, while models like QwQ 32B and DeepSeek v3 (when properly licensed) respect user rights. Never conflate ethically-licensed AI with corporate traps that forbid modification, redistribution, or independent use.

That's why today I'm excited to introduce 𝐫𝐞𝐚𝐝𝐞𝐫𝐬, the new feature of PdfItDown v1.4.0!🎉
With 𝘳𝘦𝘢𝘥𝘦𝘳𝘴, you can choose among three (for now👀) flavors of text extraction and conversion to PDF:
- 𝗗𝗼𝗰𝗹𝗶𝗻𝗴, which does a fantastic job with presentations, spreadsheets and word documents🦆
- 𝗟𝗹𝗮𝗺𝗮𝗣𝗮𝗿𝘀𝗲 by LlamaIndex, suitable for more complex and articulated documents, with mixture of texts, images and tables🦙
- 𝗠𝗮𝗿𝗸𝗜𝘁𝗗𝗼𝘄𝗻 by Microsoft, not the best at handling highly structured documents, but extremely flexible in terms of input file format (it can even convert XML, JSON and ZIP files!)✒️
You can use this new feature in your Python scripts (check the attached code snippet!😉) and in the command line interface as well!🐍
Have fun and don't forget to star the repo on GitHub ➡️ https://github.com/AstraBert/PdfItDown
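A hypothetical usage sketch of what picking a reader in a script might look like; the import path, class name, and "reader" argument below are assumptions, so check the repository README for the actual v1.4.0 API:

```python
# Hypothetical sketch -- the actual PdfItDown v1.4.0 API may differ.
from pdfitdown.pdfconversion import Converter   # import path is an assumption

converter = Converter(reader="docling")          # or "llamaparse" / "markitdown" (assumed values)
converter.convert(file_path="report.xlsx", output_path="report.pdf")
```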

Just tested it with Steve Jobs' Stanford speech and was speechless (pun intended). The video isn’t sped up.
3 things that floored me:
- Transcription took just 10 seconds for a 15-min file
- Got a CSV with perfect timestamps, punctuation & capitalization
- Stunning accuracy (correctly captured "Reed College" and other specifics)
NVIDIA also released a demo where you can click any transcribed segment to play it instantly.
The improvement is significant: #1 on the Open ASR Leaderboard, ~6% word error rate (best in class), with complete commercial freedom (CC-BY-4.0 license).
Time to update those Whisper pipelines! H/t @Steveeeeeeen for the finding!
Model: nvidia/parakeet-tdt-0.6b-v2
Demo: nvidia/parakeet-tdt-0.6b-v2
ASR Leaderboard: hf-audio/open_asr_leaderboard
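If you want to swap it into an existing pipeline, a minimal sketch using NeMo's high-level API might look like the following; exact arguments and options (e.g. for timestamps) can vary between NeMo versions, so treat this as an outline rather than the official recipe:

```python
# pip install "nemo_toolkit[asr]"
import nemo.collections.asr as nemo_asr

# Load the released checkpoint from the Hugging Face Hub via NeMo
asr_model = nemo_asr.models.ASRModel.from_pretrained("nvidia/parakeet-tdt-0.6b-v2")

# Transcribe a local 16 kHz mono WAV file (the filename here is a placeholder)
transcripts = asr_model.transcribe(["steve_jobs_stanford.wav"])
print(transcripts[0])
```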

And that’s just some of the companies from the Chinese community that released open models in April 🤯
zh-ai-community/april-2025-open-releases-from-the-chinese-community-67ea699965f6e4c135cab10f
🎬 Video
> MAGI-1 by SandAI
> SkyReels-A2 & SkyReels-V2 by Skywork
> Wan2.1-FLF2V by Alibaba-Wan
🎨 Image
> HiDream-I1 by Vivago AI
> Kimi-VL by Moonshot AI
> InstantCharacter by InstantX & Tencent-Hunyuan
> Step1X-Edit by StepFun
> EasyControl by Shanghai Jiaotong University
🧠 Reasoning
> MiMo by Xiaomi
> Skywork-R1V 2.0 by Skywork
> ChatTS by ByteDance
> Kimina by Moonshot AI & Numina
> GLM-Z1 by Zhipu AI
> Skywork OR1 by Skywork
> Kimi-VL-Thinking by Moonshot AI
🔊 Audio
> Kimi-Audio by Moonshot AI
> IndexTTS by BiliBili
> MegaTTS3 by ByteDance
> Dolphin by DataOceanAI
🔢 Math
> DeepSeek Prover V2 by Deepseek
🌍 LLM
> Qwen by Alibaba-Qwen
> InternVL3 by Shanghai AI lab
> Ernie4.5 (demo) by Baidu
📊 Dataset
> PHYBench by Eureka-Lab
> ChildMandarin & Seniortalk by BAAI
Please feel free to add if I missed anything!

The dataset simulates the discovery phase of a fictitious VC firm called Reasoned Capital and, once expanded, can be used to create models which are able to make complex, subjective financial decisions based on different criteria.
The generation process uses increasingly complex prompts to drive recursive problem-solving, pushing models to assess and reevaluate the conclusions and opinions generated by upstream models. Pretty neat stuff, and I'm not aware of this architecture being used in a reasoning context anywhere else.
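A minimal sketch of that upstream/downstream pattern, assuming an OpenAI-compatible endpoint; the prompts, model name, and function names are placeholders, not the actual generation pipeline:

```python
from openai import OpenAI

client = OpenAI()  # assumes an OpenAI-compatible endpoint

def generate(prompt: str, model: str = "gpt-4o-mini") -> str:
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

def recursive_review(question: str, rounds: int = 3) -> str:
    answer = generate(question)
    for i in range(rounds):
        # Each downstream pass reassesses the upstream model's conclusions on a harder framing
        answer = generate(
            f"Round {i + 1}: critique the reasoning below, fix weak points, and produce an "
            f"improved investment recommendation.\n\nQuestion: {question}\n\nPrevious answer:\n{answer}"
        )
    return answer
```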
Check it out: ZennyKenny/synthetic_vc_financial_decisions_reasoning_dataset

moonshotai/Kimi-Audio-7B-Instruct
✨ 7B
✨ 13M+ hours of pretraining data
✨ Novel hybrid input architecture
✨ Universal audio capabilities (ASR, AQA, AAC, SER, SEC/ASC, end-to-end conversation)

Today, we're releasing an expanded version: 32K images annotated with 3.7M responses from over 300K individuals, collected in under two weeks using the Rapidata Python API.
Rapidata/text-2-image-Rich-Human-Feedback-32k
A few months ago, we published one of our most-liked datasets, with 13K images based on the @data-is-better-together dataset, following Google's research on "Rich Human Feedback for Text-to-Image Generation" (https://arxiv.org/abs/2312.10240). It collected over 1.5M responses from 150K+ participants.
Rapidata/text-2-image-Rich-Human-Feedback
In the examples below, users highlighted words from prompts that were not correctly depicted in the generated images. Higher word scores indicate more frequent issues. If an image captured the prompt accurately, users could select [No_mistakes].
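As an illustration of how such word scores could be aggregated from the raw highlights, here is a minimal sketch; the field name "highlighted_words" and the example records are assumptions, not the dataset's actual schema:

```python
from collections import Counter

def word_scores(annotations, prompt_words):
    """Fraction of annotators who flagged each prompt word as incorrectly depicted."""
    counts = Counter()
    for ann in annotations:                      # one annotator's selection for one image
        for word in ann["highlighted_words"]:
            counts[word] += 1
    n = max(len(annotations), 1)
    return {word: counts[word] / n for word in prompt_words}

scores = word_scores(
    [{"highlighted_words": ["astronaut"]}, {"highlighted_words": []}],  # empty selection ≈ [No_mistakes]
    prompt_words=["an", "astronaut", "riding", "a", "horse"],
)
# -> "astronaut" gets 0.5, every other word 0.0
```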
We're continuing to work on large-scale human feedback and model evaluation. If you're working on related research and need large, high-quality annotations, feel free to get in touch: [email protected].