Shyam Sunder Kumar

theainerd

AI & ML interests

Natural Language Processing

Recent Activity

Organizations

Neuropark · Speech Recognition Community Event Version 2 · Open-Source AI Meetup · Social Post Explorers · Hugging Face Discord Community

theainerd's activity

reacted to AdinaY's post with ❤️ 1 day ago
🔥 New reasoning models from the Chinese community, by Skywork 天工-昆仑万维

Skywork/skywork-or1-67fa1bcb41b436ef2def76b9

✨Skywork-OR1-Math-7B > Optimized for math reasoning
✨Skywork-OR1-7B-preview > Excels in math & coding
✨Skywork-OR1-32B-preview > Matches Deepseek-R1 on math (AIME24/25) and coding (LiveCodeBench)

Released under the Apache 2.0 license 🥳
Final version coming in 2 weeks!
New activity in theainerd/Wav2Vec2-large-xlsr-hindi 3 days ago

Update README.md

#3 opened about 1 year ago by
indalaiart
reacted to jeffboudier's post with 🚀 10 days ago
Llama4 is out and Scout is already on the Dell Enterprise Hub to deploy on Dell systems 👉 dell.huggingface.co
reacted to aiqtech's post with 🔥 17 days ago
✨ High-Resolution Ghibli Style Image Generator ✨
🌟 Introducing FLUX Ghibli LoRA
Hello everyone! Today I'm excited to present a special LoRA model for FLUX Dev.1. Trained on high-resolution Ghibli images, it makes it easy to create beautiful Ghibli-style images with stunning detail! 🎨

space: aiqtech/FLUX-Ghibli-Studio-LoRA
model: openfree/flux-chatgpt-ghibli-lora

🔮 Key Features

Trained on High-Resolution Ghibli Images - Unlike other LoRAs, this one is trained on high-resolution images, delivering sharper and more beautiful results
Powered by FLUX Dev.1 - Utilizing the latest FLUX model for faster generation and superior quality
User-Friendly Interface - An intuitive UI that allows anyone to create Ghibli-style images with ease
Diverse Creative Possibilities - Express various themes in Ghibli style, from futuristic worlds to fantasy elements

🖼️ Sample Images

💡 Prompt Tips

Include "Ghibli style" in your prompts
Try combining nature, fantasy elements, futuristic elements, and warm emotions
Add "[trigger]" tag at the end for better results

🚀 Getting Started

Enter your prompt (e.g., "Ghibli style sky whale transport ship...")
Adjust image size and generation settings
Click the "Generate" button
In just seconds, your beautiful Ghibli-style image will be created!

🤝 Community
Want more information and tips? Join our community!
Discord: https://discord.gg/openfreeai

Create your own magical world with the LoRA trained on high-resolution Ghibli images for FLUX Dev.1! 🌈✨
reacted to clem's post with 🤗 18 days ago
What's this cool purple banner haha 😶😶😶
reacted to Kseniase's post with 👀 23 days ago
8 types of RoPE

Since we use Transformers constantly, it's helpful to understand RoPE (Rotary Position Embedding). Token order matters, so RoPE encodes it by rotating token embeddings based on their position, letting the model know which token comes first, second, and so on.

Here are 8 types of RoPE that can be implemented in different cases:

1. Original RoPE -> RoFormer: Enhanced Transformer with Rotary Position Embedding (2104.09864)
Encodes token positions by rotating token embeddings in the complex plane via a position-based rotation matrix, thereby providing the self-attention mechanism with relative positional info.

2. LongRoPE -> LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens (2402.13753)
Extends the context window of pre-trained LLMs to 2048k tokens, leveraging non-uniformities in positional interpolation with an efficient search.

3. LongRoPE2 -> LongRoPE2: Near-Lossless LLM Context Window Scaling (2502.20082)
Extends the effective context window of pre-trained LLMs to the target length, rescaling RoPE guided by "needle-driven" perplexity.

4. Multimodal RoPE (MRoPE) -> Qwen2.5-VL Technical Report (2502.13923)
Decomposes positional embedding into 3 components: temporal, height and width, so that positional features are aligned across modalities: text, images and videos.

5. Directional RoPE (DRoPE) -> DRoPE: Directional Rotary Position Embedding for Efficient Agent Interaction Modeling (2503.15029)
Adds an identity scalar, improving how angles are handled without extra complexity. It helps balance accuracy, speed, and memory usage.

6. VideoRoPE -> VideoRoPE: What Makes for Good Video Rotary Position Embedding? (2502.05173)
Adapts RoPE for video, featuring 3D structure, low-frequency temporal allocation, diagonal layout, and adjustable spacing.

7. VRoPE -> VRoPE: Rotary Position Embedding for Video Large Language Models (2502.11664)
Another RoPE variant for video, which restructures positional indices and balances encoding for uniform spatial focus.

8. XPos (Extrapolatable Position Embedding) -> https://huggingface.co/papers/2212.10
Introduces an exponential decay factor into the rotation matrix​, improving stability on long sequences.
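The core rotation behind the original RoPE (1) can be sketched in a few lines of NumPy. This is a minimal illustration under the RoFormer formulation, not a reference implementation; the function name `rope_rotate` and the toy vectors are assumptions for the demo. It shows the key property: the dot product of two rotated embeddings depends only on their relative offset.

```python
import numpy as np

def rope_rotate(x, pos, base=10000.0):
    """Rotate embedding x (even dim d) by position-dependent angles,
    pairing dims (0,1), (2,3), ... as independent 2-D planes."""
    d = x.shape[-1]
    # one frequency per 2-D pair, decaying with dimension index
    freqs = base ** (-np.arange(0, d, 2) / d)
    angles = pos * freqs
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin  # standard 2-D rotation
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

q = np.random.default_rng(0).normal(size=8)
k = np.random.default_rng(1).normal(size=8)

# query at position 5 vs key at position 3: relative offset 2
a = rope_rotate(q, 5) @ rope_rotate(k, 3)
# query at 12 vs key at 10: same relative offset 2
b = rope_rotate(q, 12) @ rope_rotate(k, 10)
print(np.allclose(a, b))  # prints True
```

Because each 2-D rotation is orthogonal, embedding norms are preserved, and the attention score sees only the position difference m − n, which is exactly what gives self-attention relative positional information.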
reacted to onekq's post with 🤯 26 days ago
Folks, let's get ready! 🥳 We will be busy soon. 😅🤗 https://github.com/huggingface/transformers/pull/36878
reacted to sharpenb's post with 🔥🔥 27 days ago
We open-sourced the pruna package, which can be easily installed with pip install pruna :) It lets you easily compress and evaluate AI models, including transformers and diffusers.

- Github repo: https://github.com/PrunaAI/pruna
- Documentation: https://docs.pruna.ai/en/stable/index.html

With open-sourcing, people can now inspect and contribute to the code. Beyond the code itself, we provide a detailed README, tutorials, benchmarks, and documentation to make compression, evaluation, and saving/loading/serving of AI models transparent.

Happy to share it with you and always interested in collecting your feedback :)