As we always use Transformers, it's helpful to understand RoPE—Rotary Position Embedding. Since token order matters, RoPE encodes it by rotating token embeddings based on their position, so the model knows how to interpret which token comes first, second, and so on.
Here are 8 types of RoPE that can be implemented in different cases:
4. Multimodal RoPE (MRoPE) -> Qwen2.5-VL Technical Report (2502.13923) Decomposes positional embedding into 3 components: temporal, height and width, so that positional features are aligned across modalities: text, images and videos.
8. XPos (Extrapolatable Position Embedding) -> https://huggingface.co/papers/2212.10 Introduces an exponential decay factor into the rotation matrix, improving stability on long sequences.
Attention mechanisms allow models to dynamically focus on specific parts of their input when performing tasks. In our recent article, we discussed Multi-Head Latent Attention (MLA) in detail and now it's time to summarize other existing types of attention.
Here is a list of 15 types of attention mechanisms used in AI models:
3. Self-attention -> Attention Is All You Need (1706.03762) Each element in the sequence "looks" at other elements and "decides" how much to borrow from each of them for its new representation.
5. Multi-Head Attention (MHA) -> Attention Is All You Need (1706.03762) Multiple attention “heads” are run in parallel. The model computes several attention distributions (heads), each with its own set of learned projections of queries, keys, and values.
🚀 New smolagents update: Safer Local Python Execution! 🦾🐍
With the latest release, we've added security checks to the local Python interpreter: every evaluation is now analyzed for dangerous builtins, modules, and functions. 🔒
Here's why this matters & what you need to know! 🧵👇
1️⃣ Why is local execution risky? ⚠️ AI agents that run arbitrary Python code can unintentionally (or maliciously) access system files, run unsafe commands, or exfiltrate data.
2️⃣ New Safety Layer in smolagents 🛡️ We now inspect every return value during execution: ✅ Allowed: Safe built-in types (e.g., numbers, strings, lists) ⛔ Blocked: Dangerous functions/modules (e.g., os.system, subprocess, exec, shutil)
4️⃣ Security Disclaimer ⚠️ 🚨 Despite these improvements, local Python execution is NEVER 100% safe. 🚨 If you need true isolation, use a remote sandboxed executor like Docker or E2B.
5️⃣ The Best Practice: Use Sandboxed Execution 🔐 For production-grade AI agents, we strongly recommend running code in a Docker or E2B sandbox to ensure complete isolation.
6️⃣ Upgrade Now & Stay Safe! 🚀 Check out the latest smolagents release and start building safer AI agents today.
✔️ modification of the cross-entropy loss function designed specifically for training LLMs. ✔️ twist on the standard cross-entropy loss by emphasizing the importance of outlier prediction errors and dynamically normalizing token-level variance. ✔️ more stable and efficient training, leading to models that generalize better.
Check it out, give it a spin, and let me know what you think!
Licensed under the Apache 2.0 license and ready to use. Happy training! 🔥🤖
made a few improvements on custom grpo trainer: - added sequence similarity reward (seems to work) - improved vllm support (5x inference speed) - adjusted reward scores (this helped with format/accuracy) - can now push to hf hub (already pushed mine lol: Jaward/smollm2_360m_grpo_gsm8k_reasoner)
It now includes : - a live stream of the progress being made on the task (see included video), - The following components: 1. Automatic prompt optimization 2. An orchestrator deciding which agent to call dynamically including feedback from a human (human-in-the-loop) 3. A coding agent to complete the task 4. A code reviewing agent to iteratively provide feedback to improve the code generated by the coding agent until the code meets the required criteria after which it is approved. 5. A testing agent that tests the approved code or provides information on how to test it. 6. A documentation agent that provides documentation and a help message for the approved and tested code.
After hours of working with GitHub Copilot to organize the code, I'm keen to announce the release of Blurred Thoughts Supervised-Finetuning (BT-SFT), a new method for fine-tuning LLMs to produce more diverse and creative responses.
BT-SFT introduces: ✅ Smart tokenization method randomly masks tokens within <think> ... </think> tags, promoting the model to generate diverse responses that align better with its probability distribution instead of memorizing the thought process from distilled data. ✅ Reward function that ensures responses are well-structured.
Not many seemed to notice but what was probably meant to be a WIN for artist's rights in the US Office of Copyright has solved some fundamental issues for the community. In our recent article I outline how Companies like Suno, OpenAI, Midjourney etc can no longer claim any right to copy your work that you create with their platforms We also look at other ways this study and new rules for AI will fundamentally effect creators who use it and companies incentives to give them control over certain aspects might change because of this. it's broken down pretty well here: https://huggingface.co/blog/fuzzy-mittenz/copyright-in-ai