Ksenia Se
Kseniase
about 20 hours ago
8 types of RoPE
Since Transformers are everywhere, it's helpful to understand RoPE (Rotary Position Embedding). Token order matters, and RoPE encodes it by rotating token embeddings by an angle that depends on each token's position, so the model can tell which token comes first, second, and so on.
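The rotation idea can be sketched in a few lines of NumPy. This is a minimal illustration, not any library's implementation: the function name `rope`, the helper `score`, the pairing of consecutive features, and the base of 10000 are assumptions chosen for the example.

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Rotate consecutive feature pairs of x by position-dependent angles.

    x:         array of shape (seq_len, dim), dim must be even
    positions: array of shape (seq_len,) holding each token's position
    """
    dim = x.shape[-1]
    # One rotation frequency per feature pair: base^(-2i/dim).
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)
    angles = np.asarray(positions, dtype=float)[:, None] * inv_freq[None, :]
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    # Standard 2D rotation applied to every (even, odd) pair.
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out

rng = np.random.default_rng(0)
q = rng.normal(size=(1, 8))  # a toy query vector
k = rng.normal(size=(1, 8))  # a toy key vector

def score(m, n):
    """Attention score between q placed at position m and k at position n."""
    return float(rope(q, [m])[0] @ rope(k, [n])[0])

# Same offset (2 positions apart) at different absolute positions.
print(score(3, 1), score(10, 8))
```

The two printed scores should match up to floating-point error, because rotating the query and the key by their own positions leaves a dot product that depends only on the offset between them.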
Here are 8 types of RoPE suited to different use cases:
1. Original RoPE -> https://huggingface.co/papers/2104.09864
Encodes token positions by rotating query and key vectors with a position-dependent rotation matrix (equivalently, a rotation in the complex plane), so that self-attention scores carry relative positional info.
2. LongRoPE -> https://huggingface.co/papers/2402.13753
Extends the context window of pre-trained LLMs to 2048k tokens by exploiting non-uniformities in positional interpolation, which it finds with an efficient search.
3. LongRoPE2 -> https://huggingface.co/papers/2502.20082
Extends the effective context window of pre-trained LLMs to the target length, rescaling RoPE guided by “needle-driven” perplexity.
4. Multimodal RoPE (MRoPE) -> https://huggingface.co/papers/2502.13923
Decomposes positional embedding into 3 components: temporal, height and width, so that positional features are aligned across modalities: text, images and videos.
5. Directional RoPE (DRoPE) -> https://huggingface.co/papers/2503.15029
Adds a uniform identity scalar to RoPE's 2D rotation, improving how periodic angles are handled without extra complexity. It helps balance accuracy, speed, and memory usage.
6. VideoRoPE -> https://huggingface.co/papers/2502.05173
Adapts RoPE for video, featuring 3D structure, low-frequency temporal allocation, diagonal layout, and adjustable spacing.
7. VRoPE -> https://huggingface.co/papers/2502.11664
Another RoPE variant for video, which restructures positional indices and balances encoding for uniform spatial focus.
8. XPos (Extrapolatable Position Embedding) -> https://huggingface.co/papers/2212.10
Introduces an exponential decay factor into the rotation matrix, improving stability on long sequences.
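For the context-extension entries in the list above (LongRoPE and LongRoPE2), the general idea of rescaling rotation frequencies can be sketched as follows. This is only an illustration under stated assumptions: `rope_angles` and its `rescale` parameter are hypothetical names, the lengths are toy values, and the non-uniform factors are made up rather than taken from either paper, whose search procedures are not shown here.

```python
import numpy as np

def rope_angles(positions, dim, base=10000.0, rescale=None):
    """Rotation angles per position and feature pair.

    rescale: None, a scalar, or one factor per feature pair (dim // 2).
             Each frequency is divided by its factor, which stretches the
             positions the rotation can cover.
    """
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)
    if rescale is not None:
        inv_freq = inv_freq / np.asarray(rescale, dtype=float)
    return np.asarray(positions, dtype=float)[:, None] * inv_freq[None, :]

dim = 8
train_len, target_len = 4, 16      # toy lengths, for illustration only
uniform = target_len / train_len   # one factor shared by every pair

plain = rope_angles(np.arange(train_len), dim)
squeezed = rope_angles(np.arange(target_len), dim, rescale=uniform)

# Hypothetical non-uniform factors: one per feature pair, with the
# fastest-rotating pair left untouched. Real values would come from a search.
non_uniform = np.array([1.0, 1.5, 3.0, 4.0])
mixed = rope_angles(np.arange(target_len), dim, rescale=non_uniform)

# With uniform rescaling, every 4th long-context position lands exactly
# on an angle the model already saw during training.
print(np.allclose(squeezed[::4], plain))
```

Uniform rescaling treats every frequency the same way. Passing a per-pair array instead shows where non-uniform factors would plug in.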
Kseniase's activity
What is Qwen-Agent framework? Inside the Qwen family
🌁#92: Fight for Developers and the Year of Orchestration
🦸🏻#14: What Is MCP, and Why Is Everyone – Suddenly!– Talking About It?
🦸🏻#13: Action! How AI Agents Execute Tasks with UI and API Tools
🦸🏻#12: How Do Agents Learn from Their Own Mistakes? The Role of Reflection in AI
Everything You Need to Know about Knowledge Distillation
🌁#89: AI in Action: How AI Engineers, Self-Optimizing Models, and Humanoid Robots Are Reshaping 2025
🌁#88: Can DeepSeek Inspire Global Collaboration?
Topic 27: What are Chain-of-Agents and Chain-of-RAG?