Alfaxad Eyembe

Alfaxad

https://alfaxad.github.io/

AI & ML interests

AI, Robotics

Organizations

None yet

Alfaxad's activity

liked a model 5 days ago

shisa-ai/shisa-v2-llama3.1-405b-GGUF

Updated about 21 hours ago • 816 • 2

liked a dataset 6 days ago

MaziyarPanahi/Llama-Nemotron-Post-Training-Dataset-v1-ShareGPT

Viewer • Updated 7 days ago • 30.2M • 495 • 39

upvoted a collection 16 days ago

MedGemma Release

Collection

Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 4 items • Updated 10 days ago • 153

upvoted an article 24 days ago

Article

I trained a Language Model to schedule events with GRPO!

Apr 29 • 76

upvoted an article 26 days ago

Article

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

and 2 others • Jan 23 • 180

upvoted an article 27 days ago

Article

Everything You Need to Know about Knowledge Distillation

and 1 other • Mar 6 • 26

liked a model 28 days ago

HuggingFaceTB/SmolLM-135M

Text Generation • Updated Aug 1, 2024 • 93.3k • 206

upvoted an article about 1 month ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

and 3 others • Mar 12 • 427

liked a model about 1 month ago

unsloth/SmolLM-1.7B

Text Generation • Updated Sep 23, 2024 • 2.97k • 2

upvoted an article about 1 month ago

Article

SmolLM - blazingly fast and remarkably powerful

and 2 others • Jul 16, 2024 • 375

upvoted a collection about 1 month ago

🪐 SmolLM

Collection

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated May 5 • 227

New activity in yuntian-deng/ChatGPT about 1 month ago

Update app.py

#21 opened about 1 month ago by Alfaxad

upvoted an article about 1 month ago

Article

LLM Inference at scale with TGI

Sep 6, 2024 • 19

reacted to AdinaY's post with 🔥 about 1 month ago

Post

5127

Kimi-Audio 🚀🎧 an OPEN audio foundation model released by Moonshot AI
moonshotai/Kimi-Audio-7B-Instruct
✨ 7B
✨ 13M+ hours of pretraining data
✨ Novel hybrid input architecture
✨ Universal audio capabilities (ASR, AQA, AAC, SER, SEC/ASC, end-to-end conversation)

reacted to etemiz's post with 👀 about 2 months ago

Post

2187

It looks like the Llama 4 team gamed the LMArena benchmarks by making their Maverick model output emojis, longer responses, and ultra-high enthusiasm! Is that ethical or not? They could certainly have done a better job by working with teams like llama.cpp, just like the Qwen team did with Qwen 3 before releasing the model.

In 2024 I started playing with LLMs just before the release of Llama 3. I think Meta has contributed a lot to this field and is still contributing. Most LLM fine-tuning tools are based on their models, and the inference tool llama.cpp also has their name on it. Llama 4 is fast and maybe not the greatest in real performance, but it still deserves respect. But my enthusiasm towards Llama models is probably because they rank highest on my AHA Leaderboard:

https://sheet.zoho.com/sheet/open/mz41j09cc640a29ba47729fed784a263c1d08

Looks like they did a worse job compared to Llama 3.1 this time. Llama 3.1 has been on top for a while.

Ranking high on my leaderboard is not correlated with technological progress or parameter size. In fact, if LLM training is moving away from human alignment thanks to synthetic datasets or something else (?), it could easily be inversely correlated with technological progress. There seems to be a correlation with the location of the builders (in the West or East): Western models rank higher. This has become more visible as the leaderboard progressed; in the past there was less correlation. And Europeans seem to be in the middle!

Whether you like positive vibes from AI or not, maybe the time is approaching when humans may be susceptible to being gamed by an AI? What do you think?

4 replies

reacted to BrigitteTousi's post with 🚀 2 months ago

Post

3171

AI agents are transforming how we interact with technology, but how sustainable are they? 🌍

Design choices — like model size and structure — can massively impact energy use and cost. ⚡💰 The key takeaway: smaller, task-specific models can be far more efficient than large, general-purpose ones.

🔑 Open-source models offer greater transparency, allowing us to track energy consumption and make more informed decisions on deployment. 🌱 Open-source = more efficient, eco-friendly, and accountable AI.

Read our latest, led by @sasha with assists from myself + @yjernite 🤗
https://huggingface.co/blog/sasha/ai-agent-sustainability

1 reply

reacted to Jaward's post with 🚀🔥 3 months ago

Post

1939

This is the most exciting of this week’s releases for me: Gemini Robotics - A SOTA generalist Vision-Language-Action model that brings intelligence to the physical world. It comes with a verifiable real-world knowledge Embodied Reasoning QA benchmark. The cool part is that the model can be specialized with fast adaptation to new tasks, and such adaptations can be transferred to new robot embodiments like humanoids. Looking forward to the model and data on hf, it’s about time I go full physical:)
Technical Report: https://storage.googleapis.com/deepmind-media/gemini-robotics/gemini_robotics_report.pdf

liked a model 3 months ago

sesame/csm-1b

Text-to-Speech • Updated 12 days ago • 44.7k • 2.08k

upvoted an article 3 months ago

Article

LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on your Phone!

and 1 other • Mar 7 • 60