AkimfromParis (Akim Mousterou)

reacted to lewtun's post with 🔥 12 months ago

Post

10513

We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret we can do it together in the open!

🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.

🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.

🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.

Follow along: https://github.com/huggingface/open-r1

5 replies


reacted to burtenshaw's post with 🤗 12 months ago

Post

4101

🚧 Work in Progress! 🚧

👷‍♀️ We're working hard on getting the official agents course ready for the 50,000 students that have signed up.

If you want to contribute to the discussion, I started these community posts. Looking forward to hearing from you:

- smolagents unit in the agents course - agents-course/README#7
- LlamaIndex Unit in the agents course - agents-course/README#6
- LangChain and LangGraph unit in the agents course - agents-course/README#5
- Real world use cases in the agents course - agents-course/README#8

reacted to davidberenstein1957's post with 👀 about 1 year ago

Post

1270

You can now use the "Synthetic Data Generator" at a much larger scale with your preferred inference engine: Ollama, vLLM, TGI, and serverless inference! 🔥

Install, configure, launch!

Space: https://huggingface.co/spaces/argilla/synthetic-data-generator?duplicate=true
Examples: https://github.com/argilla-io/synthetic-data-generator/tree/main/examples

replied to MoritzLaurer's post about 1 year ago

OpenAI's revenue is forecast at $11.6 billion for 2025, so they will probably be positive.
Maybe you can burn cash when you have a valuation of $157B?! The numbers are really crazy; only history will tell…

posted an update about 1 year ago

Post

1943

💵 Polymarket is leveraging “Chatbot Arena LLM Leaderboard” on HuggingFace for online gambling on the “Top AI model on January 31?”. 🤗

As of January 3rd, 2025:
1. Gemini (83%)
2. ChatGPT (13%)
3. Other (2%)
4. Claude (2%)
5. Grok (1%)
6. Llama (<1%)

🇺🇸 The market opinion is following historical data. It's clearly biased toward the historical US AI giants, yet Polymarket is forbidden in the USA and for US citizens.

🇨🇳 The “Other” category might include Chinese AI labs that are probably the future AI leaders (Qwen, DeepSeek, Yi).

⚖️ Per the market resolution rules, if two models are tied in the evaluation, the tie is broken by alphabetical order (e.g. if both were tied, “Google” would resolve to “Yes”, and “xAI” would resolve to “No”). 🙃
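
To make the tie-break rule concrete, here is a minimal Python sketch. It is only an illustration of the alphabetical rule as described above, not Polymarket's actual resolution code; the function name `resolve_market` and the scores are made up for the example.

```python
def resolve_market(scores: dict[str, float]) -> dict[str, str]:
    """Resolve each organization to "Yes" or "No".

    Assumption: the highest score wins, and ties are broken by
    case-insensitive alphabetical order of the organization name.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    top_score = max(scores.values())
    # Every organization sharing the top score is part of the tie.
    tied = [name for name, score in scores.items() if score == top_score]
    # Alphabetical tie-break: the first name in A-Z order wins.
    winner = min(tied, key=str.casefold)
    return {name: "Yes" if name == winner else "No" for name in scores}


# Hypothetical scores, for illustration only.
example = resolve_market({"Google": 1365.0, "xAI": 1365.0, "OpenAI": 1360.0})
print(example)  # {'Google': 'Yes', 'xAI': 'No', 'OpenAI': 'No'}
```

With this rule, a tie always favors whichever name comes first in the alphabet, regardless of anything else in the evaluation.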

That might violate the Chatbot Arena usage policy? And maybe HuggingFace's? @clem
Or maybe authors and contributors should get a cut each month as “market makers”. @weichiang @angelopoulos

1 reply


replied to their post about 1 year ago

This comment has been hidden

posted an update about 1 year ago

Post

1912

🇺🇸 🇨🇦 🇬🇧 Nobel Prize winners against USSR & Japanese AI pioneers ☭🇯🇵

🇩🇪 Prof. Jürgen Schmidhuber: “The #NobelPrize in Physics 2024 for Hopfield & Hinton turns out to be a Nobel Prize for plagiarism. They republished methodologies developed in #Ukraine and #Japan by Ivakhnenko and Amari in the 1960s & 1970s, as well as other techniques, without citing the original inventors.”

1965 - First Deep Learning - USSR ☭ (Ukraine 🇺🇦 now)
Ivakhnenko and Lapa introduced the first working deep learning algorithms for deep MLPs that learn internal representations of input data.

1967/68 - Deep Learning by Stochastic Gradient Descent - Japan 🇯🇵
Shun-Ichi Amari trained MLPs with many layers in a non-incremental, end-to-end fashion from scratch by stochastic gradient descent (SGD).

1969 - Rectified linear unit - Japan 🇯🇵
In 1969, Kunihiko Fukushima introduced ReLU in the context of visual feature extraction in hierarchical neural networks.

1970 - Backpropagation - Finland 🇫🇮 😃
In 1970, Seppo Linnainmaa was the first to publish the reverse mode of automatic differentiation, now known as backpropagation.

1972 - Recurrent Neural Network - Japan 🇯🇵
In 1972, Shun-Ichi Amari published a learning recurrent neural network based on the Lenz-Ising model. (Amari's net was later called the "Hopfield network". Hopfield republished it in 1982 without citing Amari's papers.)

1979 - First Convolutional neural network - Japan 🇯🇵
The CNN architecture, known as the Neocognitron, was introduced in 1979 by Kunihiko Fukushima.

https://people.idsia.ch/~juergen/deep-learning-history.html#AMH2

11 replies


reacted to thomwolf's post with 🔥 about 1 year ago

Post

1862

Most liked and most downloaded open-source AI models from 2022 to 2024

Interactive viz: https://aiworld.eu/embed/model/model/treemap
Discussion: huggingface/open-source-ai-year-in-review-2024

reacted to malhajar's post with 👍 about 1 year ago

Post

5293

🇫🇷 Official launch of the OpenLLM French Leaderboard: an open-source initiative to benchmark the evaluation of French-language LLMs

After a lot of effort and sweat with Alexandre Lavallee, we are thrilled to announce that the OpenLLMFrenchLeaderboard is live on Hugging Face (space url: le-leadboard/OpenLLMFrenchLeaderboard), the very first platform dedicated to evaluating large language models (LLMs) in French. 🇫🇷✨

This long-term project is above all a labor of passion, but more importantly an absolute necessity. It is becoming urgent and vital to work toward more transparency in this strategic field of so-called multilingual LLMs. The first building block is therefore a systematic and systemic evaluation of current and future models.

Is your French AI model ready to stand out? Submit it in our space and see how you compare with the other models.

❓ How it works:
Submit your French LLM for evaluation, and we will test it on reference benchmarks specifically adapted for the French language. Our benchmark suite includes:

- BBH-fr: Complex reasoning
- IFEval-fr: Instruction following
- GPQA-fr: Advanced knowledge
- MUSR-fr: Narrative reasoning
- MATH_LVL5-fr: Mathematical abilities
- MMMLU-fr: Multitask understanding

The process is still manual, but we are working on automating it, with the support of the Hugging Face community.

@clem , are we getting ready for a space upgrade? 😏👀

It's not just about numbers; it's about creating an AI that truly reflects our language, our culture, and our values. OpenLLMFrenchLeaderboard is our personal contribution to shaping the future of LLMs in France.

1 reply


replied to John6666's post about 1 year ago

You are probably talking to Russian or Israeli bots that target French people... Or maybe we are not the target, and HuggingFace is. I gained 2 (bot) followers today. They follow the same 10 accounts, and among those accounts there are various fake accounts. It's a Ponzi scheme! 😅

replied to John6666's post about 1 year ago

I checked the email you gave. The emails belong to French people from the far right and from finance. Sounds like a foreign government bot. It's pretty ironic to do that here. 😅

replied to John6666's post about 1 year ago

I think they are deliberately using languages other than English to create this misunderstanding. I happened to read the Japanese version, so I realized that it was clearly wrong.😅
Recently, there has been an increase in the number of English posts and spam that look like they were created by generative AI. You can also see so-called “puppeting”, where spam accounts talk to each other.

@John6666 As soon as you liked my post about the Japanese Leaderboard, I got a very weird message in English... Sounds like a spam bot, but very introspective, semantically incoherent, and a fan of anime... : )

replied to their post about 1 year ago

PS: I added a link to awesome-japanese-llm in the blog article. : )

replied to their post about 1 year ago

@kaisugi Thank you! It was a great project and it's barely the beginning.
Hopefully, the open-source community will evaluate more LLMs, and we will discover more insights.
We have a memory restriction from MDX on Qwen 2.5 72B, but we will find a solution very soon.

posted an update about 1 year ago

Post

1544

🇯🇵 The Open Japanese LLM Leaderboard created by LLM-jp 🌸 in partnership with HuggingFace 🤗 was released today!

Blog: https://huggingface.co/blog/leaderboard-japanese
Space: llm-jp/open-japanese-llm-leaderboard

🌍 The leaderboard is available in both Japanese and English
📚 Based on the evaluation tool llm-jp-eval, with more than 20 datasets for Japanese LLMs
📊 The leaderboard showcases all the metrics for NLP experts, plus averages for NLP beginners
💻 For the comfort of users, we chose a horizontal UI and implemented it in light and dark themes with Gradio
🔬 The radar chart provides a very interesting visualization of metrics!
🌱 We are using the Japanese research platform, MDX, so please be patient!
⚡ LLMs larger than 70B will be evaluated soon…

How do you say “GPUs Go Brrr” in Japanese? -> GPUがブンブン～! (pronounced "GPU ga bunbun!") 🔥

4 replies


posted an update over 1 year ago

Post

634

Philosopher Gilles Deleuze in 1985-86 on the society of control, probabilities, and power. Visionary words in an era of autoregressive models:

"The biopolitics of populations appears when right sets about administering life, says Foucault, administering life in any open multiplicities whatever. You see the importance of the difference between discipline and biopolitics. The one is in an open space, with large multiplicities to which limits are not assignable. They can only be treated by the calculus of probabilities, hence the development of the calculus of probabilities and the meaning [sens] of the social control of probabilities, the probabilities of marriage in a nation, the probabilities of mortality, probabilities of natality. Natality, nuptiality, mortality …

... When Foucault directly addresses the question of power, namely, one of his great theses: no, power does not repress, or it represses only secondarily. What does it do? It does something much more profound and, doubtless, more formidable than repressing: it forms, it shapes. It does not silence, it does worse: it makes speak. It disciplines, it standardizes [normalise]. But repression is entirely secondary in relation to the positive operations of power.

Power does not repress, it disciplines, it manages, it controls, it standardizes, etcetera. It does not silence, it makes speak. It does not prevent acting, it makes act."

From the Deleuze Seminars at Université Paris 8 translated by Purdue University -> https://deleuze.cla.purdue.edu/

Akim Mousterou

AI & ML interests

Recent Activity

Organizations

AkimfromParis's activity