
Kenneth Hamilton PRO

ZennyKenny

AI & ML interests

Building and enablement @ montebello.ai. Certified vibe coder.

Recent Activity

updated a dataset about 10 hours ago
ZennyKenny/TRON-dataset-v.1.0
published a dataset about 11 hours ago
ZennyKenny/TRON-dataset-v.1.0
updated a dataset 1 day ago
ZennyKenny/TRON-dataset-demo

Organizations

scikit-learn, TorchGeo, Kornia AI, Blog-explorers, OpenLLM France, Team Tonic, ZeroGPU Explorers, Data is Better Together - Russian Language Team, The Nevsky Collective, Plan Communications, MLX Community, Social Post Explorers, Hugging Face Discord Community, Data Is Better Together Contributor

ZennyKenny's activity

replied to their post 4 days ago

Benchmarks nowadays focus on accuracy. It would be great if we could factor in token cost, i.e. delivering the right answer with the fewest tokens. This would motivate training toward inference efficiency.

I used to complain that models don't bother to consider whether a problem even warrants reasoning, and instead push that burden onto users. We should do better on this.

Whoa. Good point.
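For illustration, one toy way to fold token cost into a benchmark score (the scoring rule below is entirely hypothetical):

```python
def efficiency_score(correct: bool, tokens_used: int, budget: int = 2048) -> float:
    """Hypothetical metric: only correct answers earn credit, discounted
    by the share of the token budget the model consumed to get there."""
    if not correct:
        return 0.0
    return max(0.0, 1.0 - tokens_used / budget)

# A right answer in 200 tokens now outscores one that burned 1,800.
print(efficiency_score(True, 200), efficiency_score(True, 1800))  # ~0.90 vs ~0.12
```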

reacted to hesamation's post with 🔥 5 days ago
Google published a 69-page whitepaper on Prompt Engineering and its best practices, a must-read if you are using LLMs in production:
> zero-shot, one-shot, few-shot
> system prompting
> chain-of-thought (CoT)
> ReAct
> code prompting
> best practices

LINK: https://www.kaggle.com/whitepaper-prompt-engineering
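For a quick taste of the first few techniques, here are minimal prompt sketches; the task and the worked example are invented for illustration:

```python
question = "A cafe sells coffee at $3 a cup. I buy 4 cups. How much do I pay?"

# Zero-shot: just the task, no examples.
zero_shot = f"Answer the question.\n\nQ: {question}\nA:"

# Few-shot: prepend worked examples so the model infers format and style.
few_shot = (
    "Q: A pen costs $2. I buy 3 pens. How much do I pay?\nA: $6\n\n"
    f"Q: {question}\nA:"
)

# Chain-of-thought: elicit intermediate reasoning before the final answer.
chain_of_thought = f"Q: {question}\nA: Let's think step by step."
```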
replied to their post 7 days ago

I guess the short answer is that they handle subjective questions better and they improve model output traceability (i.e., better understanding of what informed the model's response).

Agree with your general thoughts on reasoning models though, they aren't the best solution for every use case.

posted an update 7 days ago
reacted to csabakecskemeti's post with 😎 8 days ago
reacted to jsulz's post with 🔥 10 days ago
Huge week for xet-team as Llama 4 is the first major model on Hugging Face uploaded with Xet providing the backing! Every byte downloaded comes through our infrastructure.

Using Xet on Hugging Face is the fastest way to download and iterate on open source models, and we've proved it with Llama 4, which got a boost of ~25% across all models.

We expect builders on the Hub to see even more improvements, helping power innovation across the community.

With the models on our infrastructure, we can peer in and see how well our dedupe performs across the Llama 4 family. On average, we're seeing ~25% dedupe, providing huge savings to the community who iterate on these state-of-the-art models. The attached image shows a few selected models and how they perform on Xet.

Thanks to the meta-llama team for launching on Xet!
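For intuition on where a number like ~25% dedupe comes from, here is a toy chunk-level sketch; Xet actually uses content-defined chunking, so this fixed-size version only illustrates the idea, not their algorithm:

```python
import hashlib
import os

def dedupe_ratio(blobs, chunk_size=64 * 1024):
    """Fraction of chunks across all blobs that were already seen before."""
    seen = set()
    total = unique = 0
    for blob in blobs:
        for i in range(0, len(blob), chunk_size):
            digest = hashlib.sha256(blob[i:i + chunk_size]).hexdigest()
            total += 1
            if digest not in seen:
                seen.add(digest)
                unique += 1
    return 1 - unique / total

# Two checkpoints sharing 90% of their bytes dedupe substantially.
base = os.urandom(1_000_000)
finetuned = base[:900_000] + os.urandom(100_000)
print(f"~{dedupe_ratio([base, finetuned]):.0%} of chunks deduplicated")
```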
replied to clem's post 15 days ago

That's largely been the move by Big Tech when they open source or open weight their models, I think. What they actually release is just a watered-down version of the product they are monetizing. Even DeepSeek and other open-source-first teams keep their monetized models private.

Maybe the same people who sold your data out the backdoor whilst championing your "privacy rights" on their platform by letting you block people have suddenly had a massive change of heart. But something tells me the play is just to further increase the gap between Big Tech and emerging players by watering down the market so much that only companies that already have compute-maximalist infrastructure will be able to train meaningful models.

Maybe I'm just a cynic though.

posted an update 16 days ago
A few new Russian-language synthetic datasets. The labelling is good, but some of the syntax and grammar is not great.

Great for Russian-language classification models, probably not great for fine-tuning Russian-language text generation.

- Virtual Assistant Query / Responses: ZennyKenny/ru_virtual_assistant_chatgpt_distill
- LLM Query / Responses: ZennyKenny/russian_llm_response_chatgpt_distill

Crazy how much language drift is still an issue, especially given that Russian constitutes nearly 5% of the content on the internet.
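If you want to poke at them, they load like any other Hub dataset; the split name here is assumed, so check the dataset card for the actual schema:

```python
from datasets import load_dataset

# Assumed split name; see the dataset card for the real columns.
ds = load_dataset("ZennyKenny/ru_virtual_assistant_chatgpt_distill", split="train")
print(ds.column_names)
print(ds[0])
```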
replied to their post 21 days ago

Definitely true about the underlying model, but the eval dataset and notebook are!

posted an update 21 days ago
Besides being the coolest named benchmark in the game, HellaSwag is an important measurement of здравый смысл (or common sense) in LLMs.

- More on HellaSwag: https://github.com/rowanz/hellaswag

I spent the afternoon benchmarking YandexGPT Pro 4th Gen, one of the Russian tech giant's premier models.

- Yandex HF Org: yandex
- More on Yandex models: https://yandex.cloud/ru/docs/foundation-models/concepts/yandexgpt/models

The eval notebook is available on GitHub and the resulting dataset is already on the HF Hub!

- Eval Notebook: https://github.com/kghamilton89/ai-explorer/blob/main/yandex-hellaswag/hellaswag-assess.ipynb
- Eval Dataset: ZennyKenny/yandexgptpro_4th_gen-hellaswag

And of course, everyone wants to see the results, so have a look at them in the context of other zero-shot experiments that I was able to find!
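For readers who want the gist without opening the notebook: a zero-shot HellaSwag pass boils down to scoring each candidate ending by the model's log-likelihood and picking the best one. A minimal sketch with a small placeholder model (the actual notebook evaluates YandexGPT Pro, not gpt2):

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def ending_score(context: str, ending: str) -> float:
    """Average log-probability of the ending tokens given the context."""
    ids = tok(context + " " + ending, return_tensors="pt").input_ids
    ctx_len = tok(context, return_tensors="pt").input_ids.shape[1]
    with torch.no_grad():
        logits = model(ids).logits[0, :-1]          # predicts tokens 1..L-1
    logp = torch.log_softmax(logits, dim=-1)
    target = ids[0, 1:]
    token_logp = logp[torch.arange(len(target)), target]
    return token_logp[ctx_len - 1:].mean().item()   # ending tokens only

row = load_dataset("Rowan/hellaswag", split="validation[:1]")[0]
pred = max(range(4), key=lambda i: ending_score(row["ctx"], row["endings"][i]))
print("predicted:", pred, "label:", row["label"])
```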
reacted to giadap's post with 🔥 21 days ago
We've all become experts at clicking "I agree" without a second thought. In my latest blog post, I explore why these traditional consent models are increasingly problematic in the age of generative AI.

I found three fundamental challenges:
- Scope problem: how can you know what you're agreeing to when AI could use your data in different ways?
- Temporality problem: once an AI system learns from your data, good luck trying to make it "unlearn" it.
- Autonomy trap: the data you share today could create systems that pigeonhole you tomorrow.

Individual users shouldn't bear all the responsibility, while big tech holds all the cards. We need better approaches to level the playing field, from collective advocacy and stronger technological safeguards to establishing "data fiduciaries" with a legal duty to protect our digital interests.

Available here: https://huggingface.co/blog/giadap/beyond-consent
reacted to onekq's post with 🚀 29 days ago
Introducing 🎉 OneSQL-v0.1 🥳, our first text-to-SQL model based on Qwen2.5-Coder. This model has achieved an EX score of 63.33 on the BIRD leaderboard (https://bird-bench.github.io/).

The model family includes 7B and 32B variants:
onekq-ai/onesql-v01-qwen-67d8e3eb1611c5532bb90c5f
and can also be found on Ollama (https://ollama.com/onekq/OneSQL-v0.1-Qwen).

My goal is to make OneSQL the most usable open-weights model for text-to-SQL. I'm currently working on best practices to help users use this model the right way and avoid pitfalls. After that, I plan to train the next version to push for a higher EX score.

Enjoy this model and feel free to share comments/questions 🤗
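If you want to try it from Python, here is a hedged sketch with transformers; the exact model ID and prompt template are my guesses, so check the model card for the official format:

```python
from transformers import pipeline

generate = pipeline(
    "text-generation",
    model="onekq-ai/OneSQL-v0.1-Qwen-7B",  # ID assumed from the collection
    device_map="auto",
)

prompt = (
    "Schema: CREATE TABLE users(id INT, name TEXT, created_at DATE);\n"
    "Question: How many users signed up in 2024?\n"
    "SQL:"
)
print(generate(prompt, max_new_tokens=128)[0]["generated_text"])
```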
replied to burtenshaw's post about 1 month ago
reacted to burtenshaw's post with 🤗 about 1 month ago
Here’s a notebook to make Gemma reason with GRPO & TRL. I made this whilst prepping the next unit of the reasoning course:

In this notebook I combine Google's model with some community tooling:

- First, I load the model from the Hugging Face hub with the latest transformers release for Gemma 3
- I use PEFT and bitsandbytes to get it running on Colab
- Then, I took Will Brown's processing and reward functions to make reasoning chains from GSM8K
- Finally, I used TRL’s GRPOTrainer to train the model

Next step is to bring Unsloth AI in, then ship it in the reasoning course. Link to the notebook below.

https://colab.research.google.com/drive/1Vkl69ytCS3bvOtV9_stRETMthlQXR4wX?usp=sharing
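For anyone skimming without opening Colab, the recipe condenses to roughly this sketch; the dataset, reward, and hyperparameters are simplified stand-ins, not the notebook's actual code:

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import GRPOConfig, GRPOTrainer

def reward_len(completions, **kwargs):
    # Toy reward that just prefers short completions; the notebook uses
    # Will Brown's GSM8K format and correctness rewards instead.
    return [-len(c) for c in completions]

dataset = load_dataset("trl-lib/tldr", split="train")  # placeholder dataset

trainer = GRPOTrainer(
    model="google/gemma-3-1b-it",
    reward_funcs=reward_len,
    args=GRPOConfig(output_dir="gemma-grpo", per_device_train_batch_size=2),
    train_dataset=dataset,
    peft_config=LoraConfig(r=16, lora_alpha=32),  # PEFT keeps it Colab-sized
)
trainer.train()
```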
reacted to mcpotato's post with 🤗 about 1 month ago
Stoked to announce we've partnered with JFrog to continue improving safety on the Hub! 🐸

Their model scanner brings new scanning capabilities to the table, aimed at reducing alert fatigue.

More on that in our blog post: https://huggingface.co/blog/jfrog
reacted to fdaudens's post with 🔥 about 1 month ago
AI will bring us "a country of yes-men on servers" instead of one of "Einsteins sitting in a data center" if we continue on current trends.

Must-read by @thomwolf deflating overblown AI promises and explaining what real scientific breakthroughs require.

https://thomwolf.io/blog/scientific-ai.html
reacted to albertvillanova's post with 🔥 about 1 month ago
🚀 Big news for AI agents! With the latest release of smolagents, you can now securely execute Python code in sandboxed Docker or E2B environments. 🦾🔒

Here's why this is a game-changer for agent-based systems: 🧵👇

1️⃣ Security First 🔐
Running AI agents in unrestricted Python environments is risky! With sandboxing, your agents are isolated, preventing unintended file access, network abuse, or system modifications.

2️⃣ Deterministic & Reproducible Runs 📦
By running agents in containerized environments, you ensure that every execution happens in a controlled and predictable setting—no more environment mismatches or dependency issues!

3️⃣ Resource Control & Limits 🚦
Docker and E2B allow you to enforce CPU, memory, and execution time limits, so rogue or inefficient agents don’t spiral out of control.

4️⃣ Safer Code Execution in Production 🏭
Deploy AI agents confidently, knowing that any generated code runs in an ephemeral, isolated environment, protecting your host machine and infrastructure.

5️⃣ Easy to Integrate 🛠️
With smolagents, you can simply configure your agent to use Docker or E2B as its execution backend—no need for complex security setups!

6️⃣ Perfect for Autonomous AI Agents 🤖
If your AI agents generate and execute code dynamically, this is a must-have to avoid security pitfalls while enabling advanced automation.

⚡ Get started now: https://github.com/huggingface/smolagents

What will you build with smolagents? Let us know! 🚀💡
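A minimal sketch of what opting in looks like; the parameter and class names follow a recent smolagents release (older versions call the model class HfApiModel and configure the executor differently), so check the docs for your installed version:

```python
from smolagents import CodeAgent, InferenceClientModel

agent = CodeAgent(
    tools=[],
    model=InferenceClientModel(),  # defaults to a Hub-hosted model
    executor_type="docker",        # or "e2b"; "local" is the default
)
agent.run("Compute the 20th Fibonacci number and print it.")
```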
replied to their post about 1 month ago

Actually, the model I've used is a distill of Llama, so it meets the criteria of Free as in Freedom. Shoutout to rms.

posted an update about 1 month ago
It took me a while, but I've finally got it working: ZennyKenny/note-to-text

Using a Meta Llama checkpoint from Unsloth and some help from the HF community, you can capture handwritten notes and convert them into digital format in just a few seconds.

Really exciting times for AI builders on Hugging Face.
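Under the hood, the idea reduces to prompting a vision-language model with a photo of the page. A rough sketch, where the model ID is my guess at an Unsloth Llama 3.2 Vision checkpoint rather than the Space's actual backend:

```python
from transformers import pipeline

ocr = pipeline(
    "image-text-to-text",
    model="unsloth/Llama-3.2-11B-Vision-Instruct",  # assumed checkpoint
)
messages = [{
    "role": "user",
    "content": [
        {"type": "image", "url": "note.jpg"},
        {"type": "text", "text": "Transcribe this handwritten note verbatim."},
    ],
}]
out = ocr(text=messages, max_new_tokens=256)
print(out[0]["generated_text"][-1]["content"])  # the assistant turn
```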
reacted to Bils's post with 👍 about 2 months ago