🚩 Report: Spam
The model is fake and spam. For instance, compare model-00008-of-01939.safetensors here to OpenGVLab/InternVL3-78B's model-00008-of-00033.safetensors: the file's Large File Pointer Details show the same raw pointer, 110bc8463ce8fa4c51b9492a83761aadf9c7d5ff227d4c7b461a61eedf3c3682. The same is true for many other files here; they exactly match shards from other models and are not the model proposed on the HF page.
Another example, shard 442
https://huggingface.co/Qwen/Qwen3-Coder-480B-A35B-Instruct/blob/main/model-00185-of-00241.safetensors
42beb029c4f6b8b57a8642471550aedffa50c3f120ae65595d3eb37f59f43ed8
Shard 621
https://huggingface.co/Qwen/Qwen3-235B-A22B-Thinking-2507/blob/main/model-00005-of-00118.safetensors
5c40ca3e04508ca33cba860330220ba183b268f144a7e38799b80841b714f985
And so on: every piece of the model is copied from somewhere else; this is not an actual full model.
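For anyone who wants to reproduce the comparison: the raw pointer in the Hub's Large File Pointer Details is just the file's SHA-256 digest, so two shards with the same pointer are byte-identical. A minimal local check (the paths in the comment are illustrative, not downloads I've scripted here):

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so multi-GB shards never sit in RAM at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

# After downloading one shard from each repo (illustrative paths):
#   sha256_of("3-alpha-ultra/model-00008-of-01939.safetensors")
#   sha256_of("InternVL3-78B/model-00008-of-00033.safetensors")
# Matching digests mean the two files are byte-for-byte identical.
```

The digest you compute locally should match the raw pointer the Hub displays for that LFS file.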
At the very least, if DynaMoE is something that uses all these models, it is not complying with the license of each model taken and used here (none of which are mentioned anywhere).
The index.json is also complete nonsense and uses tensor names that don't exist: https://huggingface.co/deca-ai/3-alpha-ultra/blob/main/model.safetensors.index.json
You can also see in the commit history that they felt the need to update the total size in the safetensors index so HF reports their nonsense correctly (afaik; correct me if wrong): https://huggingface.co/deca-ai/3-alpha-ultra/commit/61ecc89970416ee8695f00979551a6d6884783bb
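You can verify the mismatch between the index and the shards without loading any weights: a .safetensors file begins with an 8-byte little-endian header length, followed by a JSON header listing every tensor the shard contains. A stdlib-only sketch (the function names are mine, not from any repo here) that cross-checks an index's weight_map against the shard headers:

```python
import json
import os
import struct

def tensor_names(shard_path):
    """Read only the safetensors JSON header: 8-byte LE length, then the header bytes."""
    with open(shard_path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return {k for k in header if k != "__metadata__"}

def missing_tensors(index_path):
    """Yield (tensor, shard) pairs listed in weight_map that the named shard lacks."""
    with open(index_path) as f:
        weight_map = json.load(f)["weight_map"]
    base = os.path.dirname(index_path)
    cache = {}
    for tensor, shard in weight_map.items():
        if shard not in cache:
            cache[shard] = tensor_names(os.path.join(base, shard))
        if tensor not in cache[shard]:
            yield tensor, shard
```

In a real index the weight_map values are shard filenames relative to the index's directory, which is why the paths are joined against it; any pairs this yields are tensor names the index promises but the shards don't deliver.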
It's a very good lesson on how to spot either fake spam or horrifically engineered models, if anything! :)
Let me answer this:
One: this isn't spam. Deca 3 Alpha is an experiment, and yes, it's scaffolded from existing models. That was intentional and mentioned upfront. We're testing routing, reproducibility, and scaling; we didn't pretrain this.
Two: all reused components are properly licensed. We'll be adding a NOTICE.md to clarify provenance, including InternVL.
Three: the commits you see were because HF wasn't correctly reporting the size when we added the entire model.
horrifically engineered models
True, but I wouldn't say it that way. I'd say:
under-engineered models
I think it's time we understand why we have "alpha" in the name.
We are going to update the readme to reflect this. We are going to add that notice file. We are going to release everything transparently, but until then, please be patient.
Show us where you run this model and some outputs.
Don't believe me? Try for yourself: https://deca.genlabs.dev/chat But it uses only one expert because everyone is in such a rush
Giving the full benefit of the doubt that this was a very badly worded release and not outright lying: the endpoint also doesn't work (yet).
yes, it’s scaffolded from existing models.
This should have been clarified in the readme.
the commits you see were because HF wasn't correctly reporting the size when we added the entire model
Well, yes, that happens because HF can't properly read the specified tensors from the safetensors files. At best it's a misuse of the .index.json :p
True, but I wouldn't say it that way. I'd say:
under-engineered models
I don't disagree that this is possible, but also:
- There are no router weights to route to the different pseudo-experts (at least none that I saw)
- The readme acts like it was pretrained on a huge dataset
- 99% of projects like this are done by scaffolding around other API providers, and I'm not sure what benefit monorepoing like this is supposed to provide in this case.
Answers
There are no router weights to route to the different pseudo-experts (at least none that I saw)
That's part of the DynaMoE software. To keep everything transparent, I've released my v0.0001 ai-cli here: https://github.com/genlabsAI/ai-cli This doesn't have router weights yet, but it will.
The readme acts like it was pretrained on a huge dataset
It's ChatGPT-generated; we didn't want to leave it blank. I even mentioned this in the Reddit post.
Show us where you run this model and some outputs.
AI is powerful, but I can't vibe-code a chat interface in 10 minutes.
Why did you make a Reddit post advertising your model if nothing is ready? No benchmarks, no test results, no published paper, no working service, no working inference code, no details? What is the point?
In addition, you cannot ignore the licenses of the models you cobbled together and slap your own license on top. If you can't or won't state all the involved models, I'd say it's a safe assumption that this frankenmoe is filled with conflicting licenses.
I'm sorry, but did they really just jam a bunch of existing models together (with or without the expert-selection layer; it doesn't make much of a difference at that point), rename all of them without credit, add an incompatible license on top, and release that as a new MoE?
Because if so, lol, you should associate with DavidAU, he's a specialist in broken models like that, but at least he's not hiding what the base models are.
There are no router weights to route to the different pseudo-experts (at least none that I saw)
That's part of the DynaMoE software. To keep everything transparent, I've released my v0.0001 ai-cli here: https://github.com/genlabsAI/ai-cli This doesn't have router weights yet, but it will.
This one appreciates the attempt, but... this software doesn't seem compatible with the weights released here? At all?
Take, for example, list_experts here: it expects a directory of experts, which just isn't there no matter how you look at it.
@Lockout @Fizzarolli @mimizukari The chat is live: https://deca.genlabs.dev/chat It has a system prompt; if I remove it, the model will say the routed expert's name when asked "who are you".
The reason comes out.
I get it now. It is a pretty good idea: just load Qwen 235B on the backend, say it is Alpha 4.8T, and charge what you would expect for 4.8T-model inference. And then your vict... customer can't even say that he is getting the wrong model, since 235B is part of your fake 4.8T model.
Don't believe me? Try for yourself: https://deca.genlabs.dev/chat But it uses only one expert because everyone is in such a rush
The chat is not available now.
The reason comes out.
I get it now. It is a pretty good idea: just load Qwen 235B on the backend, say it is Alpha 4.8T, and charge what you would expect for 4.8T-model inference. And then your vict... customer can't even say that he is getting the wrong model, since 235B is part of your fake 4.8T model.
2.5 is very different from 3 Alpha. 2.5 also uses a similar architecture, but 2.5 Ultra uses parallel thinking. Also, DynaMoE will have ensembling, so you really are paying for better performance with Deca 3.
I don’t want to be dismissive nor defensive, but maybe y’all should wait until dynamoe comes and then rethink it.
There's not even a config.json ...
Creative website :3
const simulateQueue = (element) => {
    return new Promise(resolve => {
        const queueSize = Math.floor(Math.random() * 5) + 4;
        let currentPosition = queueSize;
That's from gpt-oss. If you don't believe me, see the weights.
Only in openai/gpt-oss-120b: 24
Only in deca-ai/3-alpha-ultra: 1940
Total files compared: 1964
Only in openai/gpt-oss-20b: 6
Only in deca-ai/3-alpha-ultra: 1940
Total files compared: 1946
Looking at the hashes didn't help :(
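The counts above read like the output of a digest-based set comparison (24 + 1940 = 1964, i.e. no shared content was found). A sketch of how such a report could be produced, assuming {filename: sha256} maps have already been collected for each repo (diff_by_digest is a hypothetical helper, not the tool actually used here):

```python
def diff_by_digest(digests_a, digests_b):
    """Compare two repos by content digest, ignoring filenames entirely.

    digests_a, digests_b: {filename: sha256} maps for each repo.
    """
    a, b = set(digests_a.values()), set(digests_b.values())
    return {
        "only_a": len(a - b),   # content present only in repo A
        "only_b": len(b - a),   # content present only in repo B
        "shared": len(a & b),   # byte-identical files across repos
        "total": len(a | b),    # distinct contents compared
    }
```

When "shared" is 0, the two repos have no byte-identical files, which is what "looking at the hashes didn't help" suggests happened against gpt-oss.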
Why did you make a Reddit post advertising your model if nothing is ready? No benchmarks, no test results, no published paper, no working service, no working inference code, no details? What is the point?
In addition, you cannot ignore the licenses of the models you cobbled together and slap your own license on top. If you can't or won't state all the involved models, I'd say it's a safe assumption that this frankenmoe is filled with conflicting licenses.
To clarify, I was the one who made that post. My friend came across this model the other day, and I decided to share it on Reddit X)
Yes that post isn't ours. We didn't even expect all this. Please wait until we have stuff ready and then start complaining if you want.
There's not even a config.json ...
There is. It's the .dynamoeconfig file.
@SaisExperiments That's to throttle people
There are many ways to implement a delay without being misleading to the user :D
"I'm not doing anything shady"
Proceeds to:
- Copy-paste a bunch of 3rd-party models together and advertise the "result" as the largest, biggest, baddest model ever
- "Forget" to credit the models and companies
- Attempt to "sell" the inference compute despite not being allowed to do so by at least half of said companies
- Apply an incompatible license on top of those models
- "Forget" to include the most basic config file
- Not include an actual MoE selection layer (and even then, the word Expert is a misnomer here)
- Provide improbable fine-tuning / training information
- Write a chat program that lies about what it's doing (it's tots throttling!)
- Have a parent company's website that is shady AF: it barely runs properly, and all important pages (privacy, info and so on) are 404
- Not speak English properly, while having to vibe-code a simple chat front-end
I wonder why people would have doubts... Such a mystery...
Totally fair to be skeptical—especially with how rough the release looked. It was early, messy, and missing pieces we were still preparing. But it wasn’t meant to be a polished product. It was a testbed for routing, reproducibility, and scale behavior using DynaMoE.
Yes, parts were copied. That was intentional. Yes, the docs were incomplete. That’s being fixed. And yes, the site’s still under construction. We’re building fast, in public, and not everything lands cleanly the first time.
But if you’re willing to wait and see, you’ll find that this isn’t smoke and mirrors. It’s a real architecture, with real routing logic, and a roadmap that’s moving quickly. The beta (and the real Deca 3) will speak for itself.
Appreciate the scrutiny—even when it’s sharp. It helps us build better.
Are you vibe-pr-responding?
Yes I am, if you don't mind -_- (and btw this isn't a PR)
But this is still the truest thing I can say right now:
...if you’re willing to wait and see, you’ll find that this isn’t smoke and mirrors...
This one appreciates the attempt, but... this software doesn't seem compatible with the weights released here? At all?
Take, for example, list_experts here: it expects a directory of experts, which just isn't there no matter how you look at it.
It's supposed to decode .dynamoeconfig into this folder
Fraud, con artist, scammer, those all describe you to a T. It doesn't matter how nice you act, it's just that, an act, you will do anything to pull one over on people to get their money. You can't vibe code your way to making a model supposedly better than GPT-5 and Gemini Pro 2.5.
I can state with certainty your model is not any better. Full stop. You are lying and making absurd claims.
Deca 2.5 isn't vibe code.
First of all, Qwen/DeepSeek is almost as good as Gemini 2.5.
Secondly, routing can improve performance
Third,
you will do anything to pull one over on people to get their money
Don't pay me if you don't like it. I'm not scamming anybody.
Moral: Prove your point, or wait for me to prove mine
Deca 2.5 isn't vibe code.
First of all, Qwen/DeepSeek is almost as good as Gemini 2.5.
Secondly, routing can improve performance
Third, you will do anything to pull one over on people to get their money
Don't pay me if you don't like it. I'm not scamming anybody.
Yep, it's all vibe coded, fraud, fake, where is your paper? Why is nothing ready yet you are claiming it's better, without any benchmarks?? What are your credentials? Who are you?
Why is your site a broken mess? Why do you think it's normal to roll out a model, and have nothing ready? Why do you think this model should cost more than all the SOTA models? Why should anyone pay you a dime vs just using the real models, which are much cheaper?
You don't fool me, you can try all you like, perhaps you are just delusional, but I'd be more inclined to call you for what you know you are a scammer.
I don't need to prove anything, you, who is likely a small boy, have to prove your claims. That your clown show of a model is better than all other SOTA models.
And 2 million token context?! Wow, how does a vibe coding scammer manage to best Google I wonder?
Have you heard of something called MiniMax?
Okay, I feel like my point flew above your head. I'm not surprised tbh. Point was: you. are. obviously. a. scammer. I'm very confused as to why you're still here after being exposed, though. There's no winning strategy here for you.
However, for those who want to dig deeper, check their github history, the people who make commits, google their (nick)names. It's a whole rabbit hole :D They aren't that good at opsec.
I've heard of something called cope, and making every excuse ever to try to explain away nonsense. How's Iceland? It's interesting that a small American company has its domain contact in Iceland. https://who.is/rdap/genlabs.dev
Spaceship is maybe from Iceland.
If you want the truth:
Both Deca 2.5 and 3 are built on existing models. They will work. Whether it is worth it or not is totally up to you. But they will work.
There's no winning strategy here for you.
Then why are you talking? If I'm here to fail, let me fail. And if I'm here to succeed, do not hinder me.
That's not how it works, I'm rather impressed how you manage to lie so easily, perhaps that is one thing you are good at.
You don't get to throw your scam into other people's faces and try to trick them; everyone should be made aware that you are trying to scam them.
And now Deca 3!? Is it AGI? Does it have millions of emotions? Tell me, tell me, what mad lies can you create?
Or perhaps you aren't creating anything or thinking at all; you are simply asking an LLM (clearly not Deca) what to do. That is one unfortunate thing about LLMs: they enable scammers like you, without any competence, to fabricate a believable facade for those less aware that people like you exist. Who will act so nice, so kind, so understanding, and be lying the whole time.
Spaceship is maybe from Iceland.
If you want the truth:
Both Deca 2.5 and 3 are built on existing models. They will work. Whether it is worth it or not is totally up to you. But they will work.
There's no winning strategy here for you.
Then why are you talking? If I'm here to fail, let me fail. And if I'm here to succeed, do not hinder me.
Dude, are you actually missing layers in your wet brain? I don't need to hinder you for you to fail. You already did. However, scammers like you are making the reputation of our industry even worse than it is. So yes, I do have a problem with you, personally. Explaining what LLMs are, what neural nets are, and how they can legitimately be useful runs contrary to your very existence on this platform.
Your gross misunderstanding of everything technical, just so you can scam users, is the very reason this tech is so hated. As such, I despise you. And I'd heavily suggest you blank your GitHub history (and your sock accounts), alongside your Facebook and email, just saying :D
Ok. I failed. Case closed?
"lol guys, fine i was a scammer, can i try again"
Please, consider lobotomy.
I still don't understand how it could possibly be a scam.
You don't understand much, we've been over that already. Now go away. Be happy I don't press what I found googling your GitHub / email trace :)
Just don't come back. Reassess your life.
Ngl bullying a scammer was fun. (your mom looks nice, she wouldn't be proud of you, tho)
meta-thought: this person is clearly farming the attention they're getting from this and we should stop responding to them, not doxx them
Half of their comments and all of their commits are vibe-coded; there is clearly nothing of value here, and it's not clear literally anything was trained.
I think this argument is foolish, though that's understandable, since only a handful of users on HF have human-level intelligence. The model's approach is likely dynamic routing, so I think the original poster's point is quite reasonable and entirely correct. However, dynamic routing differs significantly from a dynamic MoE, a technical point that needs clarification.
This.
Meanwhile, conversation locked and I’m putting this on LMArena (if they approve) to test