inflatebot posted an update 12 days ago
MN-12B-Mag-Mell inflatebot/MN-12B-Mag-Mell-R1 is a year old next week (Sept. 16)!
TL;DR: AMA in the comments!!

Over the last year, Mag Mell has enjoyed a consistent presence in the roleplay scene, with 100,000 total downloads on the full-precision weights, and likely more than that on the quantized versions (although I don't control those repos, so I can't see exact numbers). (Not bad for a craft-produced merge spun up by someone who barely understood what a tensor was at the time...)

Since then, I've become a de-facto community manager for the allura-org organization, a group that I'm immensely proud of for what we've achieved in the last year - a space built by and for queer, trans, and gender-nonconforming members of the AI space across different use cases. (This has taken up basically all my mental bandwidth, which is the primary reason I haven't done more LLMs! That and Factorio. IYKYK.)

When I picked my name, I expected to burn it in a couple months' time, but instead it's become something of a rent-lowering bulwark, attracting the type of people we want and deflecting the ones we don't, and I've felt more comfortable being my whole weirdo self in the space as a result. While this hasn't resulted in much that I can share on Hugging Face, it's regardless been a very fun and fulfilling time.

If anybody reading this is a fan of MM or anything else I do, or just has something funny to ask, feel free to leave questions in the comments. I may reply directly, or I may collect them into a blog post on the 16th. We'll see how the next week goes for me.

Thank you so much for everything!

Happy birthday to the best 12B of the last year! 🥳

Happy birthday to Mag Mell! It has been my go-to 12B for quite some time now. It's the "I can't be bothered to fiddle with settings, I just want to have a good time" model for me.

Q: What I like most about Mag Mell is that not only does it come with recommended sampler settings... but it's actually pretty good. How did you arrive at those settings? I personally wouldn't have turned up min-p to 0.2, but it works well for some reason—responses still have variety.


It's basically just the Universal-Light preset with min_p turned up a skosh. The bump helped curb some of the token leakage, which, in hindsight, was most likely caused by mixing models with different tokenization bases.

Mistral Nemo is a weird beast; its tokenizer was poorly understood at launch, and the Anthracite group released a version of it with its formatting tokens replaced with those from ChatML. Mag Mell combined models based on that (notably Magnum itself, from which it gets part of its name) with models based on the original instruction-following model. I may have done something wrong with the MergeKit config, but Toasty and I couldn't hammer it out.
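To make the mismatch concrete, here is an illustrative (not taken from any of the actual model configs) side-by-side of how the same user turn is rendered under the two prompt formats that ended up mixed in the merge - Mistral's native `[INST]` tags versus ChatML's `<|im_start|>` special tokens:

```python
def mistral_prompt(user_msg: str) -> str:
    # Mistral-style instruction formatting: the turn is wrapped
    # in [INST] ... [/INST] tags.
    return f"[INST] {user_msg} [/INST]"

def chatml_prompt(user_msg: str) -> str:
    # ChatML-style formatting: each turn is delimited by
    # <|im_start|>role ... <|im_end|> special tokens.
    return (f"<|im_start|>user\n{user_msg}<|im_end|>\n"
            f"<|im_start|>assistant\n")

# The same message produces structurally different token streams,
# so a merge of models trained on each format can leak fragments
# of the "other" format's control tokens into its output.
print(mistral_prompt("Hello!"))
print(chatml_prompt("Hello!"))
```

A merged model averages weights from models that each learned one of these conventions, which is a plausible source of the stray formatting tokens described above.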

Anyways, we ended up recommending min_p of 0.2 because it keeps the character of the replies while preventing the majority of token problems out to a context length long enough to fit a whole scene.

how does a fish when it swims? how does a mag when it mells? these questions have historians for generations (of LLMs)


[image attachment: IMG_4123.jpg]