MultiTransformer (Multi🤖Transformers)

mindchain

posted an update about 9 hours ago

Scaling Physical AI: SAM 3D, NVIDIA Cosmos, and Unreal Engine!

The "Sim-to-Real" gap is officially history. In early 2026, we are no longer just rendering data; we are simulating reality. By bridging Meta’s SAM 3D, Unreal Engine, and the NVIDIA Cosmos suite, we’ve built an autonomous pipeline for Physical AI that evolves itself.

The 2026 Tech Stack:
SAM 3D: Generates high-fidelity digital twins from 2D photos in seconds.

Unreal Engine + MCP: The AI "Director" orchestrates environments via the Model Context Protocol, providing perfect Ground Truth.

NeMo Data Designer: The orchestration hub on GitHub. Following NVIDIA’s acquisition of Gretel in early 2025, its leading generative privacy and tabular tech are now fully integrated here.

NVIDIA Cosmos Transfer: Neural rendering that adds hyper-realism to Unreal Engine outputs.

NVIDIA Cosmos Predict: Predicts physically accurate motion (falling, sliding) without manual animation.

NVIDIA Cosmos Reason: The automated supervisor checking every frame for logical and physical consistency.

The Workflow:
Asset Capture: SAM 3D turns real-world photos into Nanite meshes for Unreal Engine.

Orchestration: NeMo Data Designer (with Gretel-powered integrity) defines the data schema, while AI builds the world in Unreal Engine.

Completion: NVIDIA Cosmos (Transfer & Predict) adds photorealism and physics, while NVIDIA Cosmos Reason guarantees quality.

By combining Gretel’s data heritage with the visual power of Unreal Engine, we generate 100,000 perfect frames per hour. Weights and tools are on Hugging Face. Stop labeling. Start simulating.

#PhysicalAI #SAM3D #NVIDIACosmos #UnrealEngine #NeMo #Gretel #SyntheticData #HuggingFace #Robotics #AI #ComputerVision

mindchain

posted an update 1 day ago

Skill Reflect: A Concept for Automated AI Skill Mastery

Let’s be real for a second: most of us are using AI all wrong. We send a prompt, get a "meh" answer, and then spend twenty minutes fixing it ourselves. That’s not a workflow; that’s just a digital chore. I wanted to see if I could push Claude further—to see if I could build a system that actually learns and refines itself. That’s how the Claude-Reflect-System (Skill Reflect) was born.

But here’s the thing: this isn’t some polished, final product. It’s a concept. It’s a blueprint. I’ve built the foundation of a recursive reflection loop that forces the AI to step back, look at its work, and act as its own harshest critic. It identifies the "skill delta"—the gap between "okay" and "mastery"—and closes it. This logic isn't just for Claude; you can grab this architecture and drop it right into codex-cli, terminal agents, or whatever stack you're building.
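A rough sketch of what such a reflection loop could look like is below. This is not code from the claude-reflect-system repository: `reflect_loop`, `draft`, `critique`, and `revise` are hypothetical names, and the toy functions only stand in for model calls so the control flow can run on its own.

```python
from typing import Callable, List, Tuple


def reflect_loop(
    task: str,
    draft: Callable[[str], str],
    critique: Callable[[str, str], List[str]],
    revise: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Draft once, then critique and revise until the critic finds nothing."""
    answer = draft(task)
    for round_no in range(max_rounds):
        issues = critique(task, answer)  # the "skill delta" as a list of gaps
        if not issues:
            return answer, round_no
        answer = revise(answer, issues)
    return answer, max_rounds


# Toy stand-ins for model calls (assumptions, for illustration only).
def toy_draft(task: str) -> str:
    return "def add(a, b): return a + b"


def toy_critique(task: str, answer: str) -> List[str]:
    issues = []
    if '"""' not in answer:
        issues.append("missing docstring")
    if "->" not in answer:
        issues.append("missing return type")
    return issues


def toy_revise(answer: str, issues: List[str]) -> str:
    # Fix one issue per round so the loop visibly iterates.
    if "missing return type" in issues:
        return answer.replace("(a, b):", "(a, b) -> int:")
    if "missing docstring" in issues:
        return answer.replace(": return", ':\n    """Add two numbers."""\n    return')
    return answer


final, rounds = reflect_loop("write add()", toy_draft, toy_critique, toy_revise)
print(rounds)
print(final)
```

The `max_rounds` cap is a design choice in this sketch: without it, a critic that never approves would loop forever.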

I’m a big believer in the law of causality. Action, reaction. Cause and effect. If you control the cause—the way the AI thinks about its mistakes—you dictate the effect: a perfected skill. This is a playground for builders who are tired of stochastic guessing. I want you to take this. Fork it. Break it. Make it better. This is an open invitation to the community to take this reflection loop and see how far we can push the boundaries of agentic reasoning. Whether you're building Claude Code plugins or just want to automate your self-learning, the code is there for you to smash. Stop accepting the first draft. Let’s build something that actually thinks.

https://github.com/haddock-development/claude-reflect-system

#Skills #ClaudeCode #ClaudeCodeSkills #ClaudeCodePlugins #ClaudeCodeMarketplace #CodexCLI #AI #SelfLearning #Automation #OpenSource #LLM #Reasoning #Causality #Matrix #Concept

mindchain

posted an update 2 days ago

Neural Traffic Control: Orchestrating Multi-Path Reasoning 🚥
The future of AI isn't just about "better" models—it’s about high-precision orchestration. We are moving from linear processing to Parallel MTP-Reasoning, where we manage neural traffic across stabilized, transparent, and recursive highways.

1️⃣ The Backbone: Stabilized High-Dimensional Routing (arXiv:2512.24880)
Using DeepSeek’s mHC (Manifold-Constrained Hyper-Connections), we solve the instability of deep MoE architectures. By projecting weight updates onto the Birkhoff Polytope, we ensure that our "Simpsons-style" expert lanes maintain mathematical identity. This is the hardware-level stability needed to run multiple reasoning paths without collapse.
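For readers unfamiliar with the term: the Birkhoff polytope is the set of doubly stochastic matrices (non-negative entries, every row and every column summing to 1). One standard way to push a positive matrix toward that set is Sinkhorn-Knopp normalization. The sketch below is a generic pure-Python illustration of that normalization, not DeepSeek's implementation; `sinkhorn` and the example logits are made up for the demo.

```python
import math
from typing import List

Matrix = List[List[float]]


def sinkhorn(logits: Matrix, iters: int = 50) -> Matrix:
    """Alternately normalize rows and columns of exp(logits)."""
    n = len(logits)
    m = [[math.exp(x) for x in row] for row in logits]  # strictly positive
    for _ in range(iters):
        # Row step: make every row sum to 1.
        m = [[v / s for v in row] for row, s in ((r, sum(r)) for r in m)]
        # Column step: make every column sum to 1.
        col = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col[j] for j in range(n)] for i in range(n)]
    return m


mixing = sinkhorn([[2.0, 0.1, -1.0], [0.3, 1.5, 0.2], [-0.5, 0.4, 1.0]])
row_sums = [sum(row) for row in mixing]
col_sums = [sum(mixing[i][j] for i in range(3)) for j in range(3)]
print(row_sums, col_sums)
```

A useful property here is that the product of doubly stochastic matrices is again doubly stochastic, so stacking many such mixing steps cannot blow up the total mass being routed.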

2️⃣ The Vision: Gemma Scope 2 & Feature Steering
You can't steer what you can't see. Gemma Scope 2 provides the "X-ray" for our highways. By using Sparse Autoencoders (SAEs), our Meta-Controller identifies the active features in each expert lane. We don't just route data; we route intent by monitoring feature-drift in real-time.

3️⃣ The Logic: Recursive Open Meta-Agents (arXiv:2512.24601)
We integrate the ROMA (Recursive Open Meta-Agent) framework. Instead of a flat response, the model operates in a recursive loop, refining its internal state before any output occurs. This is the "brain" of our [Meta-Controller GitHub Repo], enabling the model to simulate and discard weak logic internally.

4️⃣ The Simulation: Parallel MTP-Reasoning
This is where it comes together: Multi-Token Prediction (MTP) meets Parallel Simulation. Our Python-driven controller runs three parallel Gemma 3 instances.

The Process: 3 paths generated simultaneously.

The Filter: A 500-token lookahead window.

The Decision: The Meta-Controller uses SAE-data from Gemma Scope to select the path with the highest logical fidelity.
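The process, filter, and decision steps above could be sketched as a generate-score-select loop. Everything named here is hypothetical: `generate_path` stands in for a Gemma 3 call, and `score_path` stands in for whatever SAE-derived signal the controller uses. The toy versions exist only so the selection logic runs.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

LOOKAHEAD_TOKENS = 500  # window size taken from the post


def select_best_path(
    prompt: str,
    generate_path: Callable[[str, int], List[str]],
    score_path: Callable[[List[str]], float],
    n_paths: int = 3,
) -> Tuple[int, List[str], float]:
    """Generate n_paths candidates in parallel and keep the best-scoring one."""
    with ThreadPoolExecutor(max_workers=n_paths) as pool:
        # pool.map preserves input order, so index i is the path from seed i.
        paths = list(pool.map(lambda seed: generate_path(prompt, seed), range(n_paths)))
    # Score only the lookahead window of each candidate.
    scores = [score_path(path[:LOOKAHEAD_TOKENS]) for path in paths]
    best = max(range(n_paths), key=scores.__getitem__)
    return best, paths[best], scores[best]


# Toy stand-ins (assumptions, for illustration only).
def toy_generate(prompt: str, seed: int) -> List[str]:
    endings = [["maybe"], ["therefore", "so"], ["so"]]
    return prompt.split() + endings[seed]


def toy_score(tokens: List[str]) -> float:
    # Pretend "logical fidelity" is the share of connective tokens.
    return sum(t in {"therefore", "so"} for t in tokens) / len(tokens)


best, path, score = select_best_path("two plus two is four", toy_generate, toy_score)
print(best, path, round(score, 3))
```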

The Result: A self-correcting, transparent, and multi-threaded reasoning engine. We aren't just scaling parameters; we are scaling architectural precision. 🧠

mindchain

posted an update 4 days ago

The Architecture of 2026: Beyond the Token Trap 🚀

We are witnessing a tectonic shift in Transformer architecture. It’s no longer just about "predicting the next token"—it’s about executing latent plans on a high-speed data highway.

What happens when we combine DeepSeek’s stability with Google’s strategic intelligence?

1️⃣ The Infrastructure: DeepSeek’s mHC
Moving from a single-lane residual stream to a multi-lane highway. Using the Birkhoff Polytope, mHC ensures mathematical stability (Identity Mapping) while routing specialized data through dedicated lanes.

2️⃣ The Intelligence: Google’s Meta-Controller
An internal AI unit that lives inside the Transformer. It escapes the "Token Trap" by extracting data to create a latent plan, steering the model via Temporal Abstraction.

The Synergy: In a Topological Transformer, the Meta-Controller finally has the "dedicated lanes" it needs to steer complex reasoning without causing gradient explosions.

We aren't just making models bigger; we are making them architecturally smarter. 🧠

#MachineLearning #DeepSeek #GoogleAI #Transformer #AIArchitecture

Reubencf

posted an update 4 days ago

Happy New Year 2026
I have planned to build many things this year; most of them will be cheaper or free alternatives to paid products.

I am looking forward to releasing some useful Spaces ✌️ Stay tuned!

Reubencf

posted an update 9 days ago

As 2025 is ending, I would like to thank everyone for trying out
Reubencf/Nano_Banana_Editor

Looking forward to building and releasing more in the future for the open source community.

Parveshiiii

posted an update 16 days ago

Hey everyone!
We’re excited to introduce our new Telegram group: https://t.me/XenArcAI

This space is built for **model builders, tech enthusiasts, and developers** who want to learn, share, and grow together. Whether you’re just starting out or already deep into AI/ML, you’ll find a supportive community ready to help with knowledge, ideas, and collaboration.

💡 Join us to:
- Connect with fellow developers and AI enthusiasts
- Share your projects, insights, and questions
- Learn from others and contribute to a growing knowledge base

👉 If you’re interested, hop in and be part of the conversation: https://t.me/XenArcAI

daavoo

posted an update 19 days ago

2025: The Year of Agents.
2026: The Year of Local Agents?

Relying on cloud-hosted LLMs is often overkill. While frontier models still lead in complex coding, local models are now more than capable of handling many agentic workflows, with no network round-trip and with your data never leaving your machine.

To help bridge the gap between local inference and usable agents, I’m releasing agent.cpp: https://github.com/mozilla-ai/agent.cpp

It provides minimal, high-performance building blocks for agents in C++, built directly around the awesome llama.cpp ecosystem.
Stop sending your data to a remote API. Start building and running agents on your own hardware.

Nymbo

posted an update 20 days ago

🚨 New tool for the Nymbo/Tools MCP server: The new Agent_Skills tool provides full support for Agent Skills (Claude Skills but open-source).

How it works: The tool exposes the standard discover/info/resources/validate actions. Skills live in /Skills under the same File_System root, and any bundled scripts run through Shell_Command, so no new infrastructure is required.

```python
Agent_Skills(action="discover")  # List all available skills
Agent_Skills(action="info", skill_name="music-downloader")  # Full SKILL.md
Agent_Skills(action="resources", skill_name="music-downloader")  # Scripts, refs, assets
```
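To give a feel for what a discover action might do under the hood, here is a minimal sketch that scans a skills root for `SKILL.md` files and reads the `name` and `description` fields from their frontmatter. It is not the Nymbo/Tools implementation: `discover_skills` and `parse_frontmatter` are hypothetical helpers, and the frontmatter parser only handles flat `key: value` lines.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def parse_frontmatter(text: str) -> Dict[str, str]:
    """Read flat key: value pairs between the leading '---' markers."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta


def discover_skills(root: Path) -> List[Dict[str, str]]:
    """List every direct subfolder of root that contains a SKILL.md."""
    skills = []
    for skill_md in sorted(root.glob("*/SKILL.md")):
        meta = parse_frontmatter(skill_md.read_text(encoding="utf-8"))
        skills.append({
            "name": meta.get("name", skill_md.parent.name),
            "description": meta.get("description", ""),
        })
    return skills


# Demo against a throwaway directory with one made-up skill.
with tempfile.TemporaryDirectory() as tmp:
    skill_dir = Path(tmp) / "music-downloader"
    skill_dir.mkdir()
    (skill_dir / "SKILL.md").write_text(
        "---\nname: music-downloader\n"
        "description: Wraps yt-dlp for audio extraction\n---\n# Usage\n",
        encoding="utf-8",
    )
    found = discover_skills(Path(tmp))

print(found)
```

Returning only the name and description keeps the listing cheap; the full SKILL.md body would be loaded only when a specific skill is requested.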

I've included a music-downloader skill as a working demo; it wraps yt-dlp for YouTube/SoundCloud audio extraction.

Caveat: On HF Spaces, Shell_Command works for most tasks, but some operations (like YouTube downloads) are restricted due to the container environment. For full functionality, run the server locally on your machine.

Try it out ~ https://www.nymbo.net/nymbot

Reubencf

posted an update 22 days ago

Great news!
Reubencf/Nano_Banana_Editor now supports black-forest-labs/FLUX.1-Kontext-dev and Qwen/Qwen-Image-Edit-2509

Just log in with Hugging Face and try it out

KingNish

posted an update 27 days ago

Muon vs MuonClip vs Muon+AdamW

Muon has gone from an experiment to a mainstream optimizer, but does it hold up for fine‑tuning? We ran head‑to‑head tests on Qwen3‑4B (10k+ high‑quality instruction rows) to find out.

Short story: Pure Muon converged fastest at the start, but its gradient‑norm spikes made training unstable. MuonClip (Kimi K2’s clipping) stabilizes long pretraining runs, yet in our small‑scale fine‑tune it underperformed, with lower token accuracy and slower convergence. The winner was the hybrid: Muon for 2D layers + AdamW for 1D layers. It delivered the best balance of stability and final performance and even beat vanilla AdamW.
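As a rough illustration of the hybrid's parameter split, the sketch below partitions parameters by tensor rank: rank-2 weights go to Muon, everything else to AdamW. It is an assumption-laden toy, not the blog's training code: `split_for_hybrid` is a hypothetical helper, and the names and shapes in `toy_shapes` are illustrative rather than read from Qwen3‑4B.

```python
from typing import Dict, List, Tuple


def split_for_hybrid(shapes: Dict[str, Tuple[int, ...]]) -> Tuple[List[str], List[str]]:
    """Route rank-2 tensors to Muon and all other tensors to AdamW."""
    muon: List[str] = []
    adamw: List[str] = []
    for name, shape in shapes.items():
        (muon if len(shape) == 2 else adamw).append(name)
    return muon, adamw


# Illustrative parameter names and shapes (assumptions, not real checkpoints).
toy_shapes = {
    "layers.0.attn.q_proj.weight": (2560, 2560),
    "layers.0.attn.q_norm.weight": (128,),
    "layers.0.mlp.up_proj.weight": (9728, 2560),
    "layers.0.input_layernorm.weight": (2560,),
}

muon_params, adamw_params = split_for_hybrid(toy_shapes)
print("Muon :", muon_params)
print("AdamW:", adamw_params)
```

With a real PyTorch model the same split can be driven by `p.ndim` over `model.named_parameters()`. Some Muon setups also keep embeddings and the output head on AdamW even though they are 2D; whether that applies here depends on the blog's exact configuration.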

Takeaway: for small-scale fine-tuning, hybrid = practical and reliable.

Next Step: scale to larger models/datasets to see if Muon’s spikes become catastrophic or if clipping wins out.

Full Blog Link: https://huggingface.co/blog/KingNish/optimizer-part1

KingNish

posted an update 29 days ago

I tested Muon vs MuonClip vs Muon+AdamW for fine-tuning LLMs.
I just published a blog post on it. Read it here 👉 https://huggingface.co/blog/KingNish/optimizer-part1

Nymbo

posted an update about 1 month ago

🚀 I've just shipped a major update to the Nymbo/Tools MCP server: the Agent_Terminal, a single "master tool" that cuts token usage by over 90%!

Anthropic found 98.7% context savings using code execution with MCP, and Cloudflare published similar findings. This is my open-source implementation of the same idea.

# The Problem

Traditional MCP exposes every tool definition directly to the model. With 12 tools, that's thousands of tokens consumed *before the conversation even starts*. Each tool call also passes intermediate results through the context window — a 10,000-row spreadsheet? That's all going into context just to sum a column.

# The Solution: One Tool to Rule Them All

Agent_Terminal wraps all 12 tools (Web_Search, Web_Fetch, File_System, Generate_Image, Generate_Speech, Generate_Video, Deep_Research, Memory_Manager, Obsidian_Vault, Shell_Command, Code_Interpreter) into a single Python code execution gateway.

Instead of the model making individual tool calls, it writes Python code that orchestrates the tools directly:

```python
# Search for Bitcoin price
result = Web_Search("current price of bitcoin", max_results=3)
print(result)
```

Don't know what tools are available? The agent can discover them at runtime:

```python
print(search_tools('image'))  # Find tools by keyword
print(usage('Generate_Image'))  # Get full docs for a specific tool
```
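The reason this saves context can be shown with a stripped-down gateway: the model's code runs in a namespace where tools are plain functions, and only what the code prints goes back to the model. This is a sketch of the general idea, not the Agent_Terminal source. `run_agent_code` and `Load_Sheet` are hypothetical, and there is no sandboxing here, so it should not be pointed at untrusted code as-is.

```python
import contextlib
import io
from typing import Callable, Dict


def run_agent_code(code: str, tools: Dict[str, Callable]) -> str:
    """Run model-written code with tools in scope; return only captured stdout."""
    buffer = io.StringIO()
    namespace = dict(tools)  # tools become callable names inside the code
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)  # NOTE: no sandboxing in this sketch
    return buffer.getvalue()


# Hypothetical tool that returns a large intermediate result.
def Load_Sheet(rows: int):
    return [{"amount": i} for i in range(rows)]


# What the model would write: the 10,000 rows stay inside the interpreter.
agent_code = """
rows = Load_Sheet(10_000)
print(sum(r["amount"] for r in rows))
"""

output = run_agent_code(agent_code, {"Load_Sheet": Load_Sheet})
print(output)
```

Only the one-line sum reaches the context window; the spreadsheet rows never do.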

The individual direct tool calls are all still there, but they can be disabled if using the Agent_Terminal. Try it now - https://www.nymbo.net/nymbot

Reubencf

posted an update about 1 month ago

Hey everyone! 👋

I am thrilled to present MCP-1st-Birthday/Reuben_OS, my submission for the Hugging Face MCP 1st Birthday Hackathon (Creative Track).

ReubenOS is a virtual cloud-based operating system designed specifically to act as a backend for Claude Desktop via the Model Context Protocol (MCP). It gives Claude a persistent environment to work in!

✨ Key Features

* 📱 Flutter IDE: Claude can write Flutter code and I can view/execute the files directly in the ReubenOS dashboard.
* 🎵 AI Audio Studio: Integrated with ElevenLabs to generate songs and voiceovers from text prompts within Claude.
* 🔒 Secure File System: A passkey-protected file system (private & public folders) to store code, JSON, and documents.
* 🧠 Gemini Integration: Access Google's Gemini model directly inside the OS.
* 📝 Quiz Engine: Ask Claude to "Create a Python quiz," and it deploys a graded interactive quiz to the web instantly.

Parveshiiii

posted an update about 2 months ago

Another banger from XenArcAI! 🔥

We’re thrilled to unveil three powerful new releases that push the boundaries of AI research and development:

🔗 XenArcAI/SparkEmbedding-300m

- A lightning-fast embedding model built for scale.
- Optimized for semantic search, clustering, and representation learning.

🔗 XenArcAI/CodeX-7M-Non-Thinking

- A massive dataset of 7 million code samples.
- Designed for training models on raw coding patterns without reasoning layers.

🔗 XenArcAI/CodeX-2M-Thinking

- A curated dataset of 2 million code samples.
- Focused on reasoning-driven coding tasks, enabling smarter AI coding assistants.

Together, these projects represent a leap forward in building smarter, faster, and more capable AI systems.

💡 Innovation meets dedication.
🌍 Knowledge meets responsibility.

Parveshiiii

posted an update about 2 months ago

SparkEmbedding - SoTA cross-lingual retrieval

I am very happy to announce our latest embedding model, SparkEmbedding-300m. It is based on embeddinggemma-300m, which we fine-tuned on 1M extra examples spanning over 119 languages. The result is a model that achieves exceptional cross-lingual retrieval.

Model: XenArcAI/SparkEmbedding-300m

lunarflu

posted an update 2 months ago

The #1 trending AI/ML dataset today 🏆

Massive scale, diversity and end-to-end potential from NVIDIA!
nvidia/PhysicalAI-Autonomous-Vehicles

lunarflu

posted an update 2 months ago

The new King 👑 has arrived!

Moonshot AI now has the top model on Hugging Face 🔥
moonshotai/Kimi-K2-Thinking

lunarflu

posted an update 2 months ago

💸🤑You don’t need 100 GPUs to train something amazing!

Our Smol Training Playbook teaches you a better path to world-class LLMs, for free!

Check out the #1 trending space on 🤗 :
HuggingFaceTB/smol-training-playbook

Nymbo

posted an update 2 months ago

I've added an 11th tool to the Nymbo/Tools MCP server: Obsidian_Vault. I'd argue it's far more context-efficient than any other Obsidian MCP I've seen, and it doesn't require any plugins. There are also some big improvements to the Web_Search and Web_Fetch tools.

# Obsidian_Vault Tool

It's basically a read-only version of the File_System tool, but it works so well for navigating Obsidian without unnecessary context. It supports recursive (full-text) search across the entire vault, and supports offset so the agent can "scroll" through a document without re-consuming tokens.
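The "scroll" behaviour can be pictured as a read that takes an offset and hands back where to resume. The sketch below assumes character offsets and uses made-up names (`read_window`, `next_offset`, `truncated`); the real tool's parameters and units may differ.

```python
from typing import Dict, Union


def read_window(
    text: str, offset: int = 0, max_chars: int = 200
) -> Dict[str, Union[str, int, bool]]:
    """Return one slice of the document plus the offset to resume from."""
    chunk = text[offset : offset + max_chars]
    next_offset = offset + len(chunk)
    return {
        "content": chunk,
        "next_offset": next_offset,
        "truncated": next_offset < len(text),  # True while more text remains
    }


# A fake note standing in for a vault document.
note = "".join(f"line {i}\n" for i in range(100))

first = read_window(note, offset=0, max_chars=50)
second = read_window(note, offset=first["next_offset"], max_chars=50)
print(first["next_offset"], second["next_offset"], second["truncated"])
```

Because each call returns only the next slice, the agent pays for a passage once instead of re-reading the whole file on every step.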

Run the server locally and set the OBSIDIAN_VAULT_ROOT environment variable to your vault's root path. If you don't use Obsidian, this is perfectly usable as simply a read-only filesystem.

# Web_Search Improvements

The Web_Search tool previously used only DuckDuckGo as its backend search engine, but it now also supports Bing, Brave, Yahoo, and Wikipedia. The default engine is auto, which provides results from all backends in recommended order. Web_Search still doesn't require any kind of API key or auth.

There's also a new date filter to limit results to those created in the past day, week, month, or year. Oh, and uhh, SafeSearch is now off by default :)

# Web_Fetch Improvements

As context-efficient as the Markdown mode is for web browsing, sometimes it does lose important context in the conversion from HTML to Markdown. So I've added a new HTML mode to the Web_Fetch tool that basically executes a cURL request on the URL, returning the full HTML page if necessary.

# A Note on Claude Skills

I've been having fun with the new File_System and Shell_Command tools. Using Claude Skills doesn't currently work in the public HF space because of environment restrictions, but using Skills works perfectly well running locally.

Happy building ~
