Very fast and powerful, but with one glaring weakness.

#6
by phil111 - opened

This is by far the most powerful AI model I've ever used at this speed. It writes reasonably eloquent stories with minimal contradictions in response to complex prompts filled with lists of inclusions and exclusions, and it got tough STEM questions right, such as ones about Thorne–Żytkow objects. Its only major weakness, besides needlessly excessive alignment, appears to be general knowledge (e.g. popular movies, games, music, TV shows, sports...), which causes it to hallucinate like crazy, even at very low temperatures. Smaller (but slower) models like Mistral Small 2409 24b have far broader knowledge.

The idea in MoEs is to swap in experts with the required knowledge (and also to fine-tune experts independently), but this key feature hasn't been offered yet by any inference provider.
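Purely to illustrate that idea (this is not taken from this model's actual code), here is a minimal PyTorch sketch of an MoE layer whose experts sit behind named slots, so a single expert could in principle be swapped out or fine-tuned on its own domain. The class name SwappableMoE, the expert keys, and the top-2 routing details are all assumptions made for the example.

```python
# Hypothetical sketch: a tiny MoE layer where experts are independent sub-modules
# keyed by name, so one expert can be replaced without touching the others.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwappableMoE(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, expert_names: list, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Experts live in a ModuleDict, so replacing one is just an assignment;
        # the router can be reused as long as d_model stays the same.
        self.experts = nn.ModuleDict({
            name: nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for name in expert_names
        })
        self.router = nn.Linear(d_model, len(expert_names))
        self._names = list(expert_names)

    def swap_expert(self, name: str, new_expert: nn.Module) -> None:
        """Replace a single expert, e.g. with one fine-tuned on pop-culture trivia."""
        self.experts[name] = new_expert

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        logits = self.router(x)                          # (batch, seq, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)   # route each token to its top-k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, name in enumerate(self._names):
                mask = (idx[..., k] == e)                # tokens assigned to expert e in slot k
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * self.experts[name](x[mask])
        return out

if __name__ == "__main__":
    moe = SwappableMoE(d_model=64, d_hidden=128, expert_names=["general", "stem", "pop_culture"])
    x = torch.randn(2, 10, 64)
    print(moe(x).shape)  # torch.Size([2, 10, 64])
    # Swap in a (hypothetical) expert fine-tuned separately on pop-culture data.
    moe.swap_expert("pop_culture", nn.Sequential(nn.Linear(64, 128), nn.GELU(), nn.Linear(128, 64)))
```

The point of the ModuleDict layout is only that each expert is an independently addressable module; whether a router trained with one set of experts would still route sensibly after a swap is a separate, open question.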

@21world Well, I hope they find a way to improve the MoE design, because this model is surprisingly competent and good at following instructions for its speed, but it's just too broadly ignorant to serve as a general-purpose LLM.

Frankly, the hallucination rate of LLMs when it comes to very popular common knowledge about things like top movies and music is too damn high, especially considering the space required to accurately hold said popular data is measured in hundreds of megabytes, far smaller than these models. It seems to me that the only feasible solution to the hallucinations plaguing LLMs is to seamlessly fuse them with hallucination-free databases of humanity's core knowledge, rather than sloppy and slow RAG implementations.
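To make that last point concrete, below is a deliberately crude Python sketch of the simplest "curated store takes priority" setup. It is not the seamless fusion described above (it's closer to a lookup fallback than to anything built into the model's weights), and FACT_STORE, lookup_fact, and generate_freely are all hypothetical names with made-up stub data.

```python
# Toy illustration: serve popular factoids verbatim from a small curated store,
# and only fall back to free (possibly hallucinated) generation when no entry exists.
from typing import Optional

FACT_STORE = {
    # (normalized query, verbatim fact) pairs that would be curated, not generated
    "the godfather release year": "The Godfather was released in 1972.",
    "thriller album artist": "Thriller is a 1982 album by Michael Jackson.",
}

def lookup_fact(query: str) -> Optional[str]:
    """Exact, hallucination-free lookup; returns None when the store has no answer."""
    return FACT_STORE.get(query.strip().lower())

def generate_freely(query: str) -> str:
    # Placeholder for an actual LLM call; purely hypothetical.
    return f"[model-generated answer to: {query}]"

def answer(query: str) -> str:
    fact = lookup_fact(query)
    # Prefer the curated fact verbatim; otherwise accept the model's answer,
    # ideally flagged as unverified.
    return fact if fact is not None else generate_freely(query)

if __name__ == "__main__":
    print(answer("The Godfather release year"))   # served from the fact store
    print(answer("best selling game of 2003"))    # falls back to the model
```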
