Question
I'm not sure if this is the right forum for questions. If it isn't, please disregard with my apologies.
Will we see another boom in 100B+ models, or are the big boys going extinct? Thoughts?
At the moment we are quantizing Agatha-111B-v1, which was released yesterday, as well as all the MLA-based models such as DeepSeek-R1-0528, DeepSeek-V3-abliterated, and r1-1776, all of which are above 100B. We have also had quite a few Llama 4 Scout finetunes in the past weeks, likewise all above 100B. We are looking forward to doing the dots.llm1-based models as soon as they are supported by llama.cpp, and there is also the soon-to-be-released 2T-parameter Llama 4 Behemoth.

I don't think models above 100B are going anywhere; they are just not that popular because finetuning them is quite expensive. As AstroSage-70B showed, there is also not really a need to go above 70B for domain-specific models. It's also worth mentioning that models on HuggingFace seem to follow a seasonal pattern and come in waves, with most releases landing in the first quarter of the year. If anything, models seem to get larger over time; I feel like we have far more 70B models now than in the past.
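For anyone curious what "quantizing" one of these releases actually involves, here is a minimal sketch of the usual llama.cpp GGUF pipeline: convert the HuggingFace checkpoint to a full-precision GGUF, then quantize it down to a smaller type. The model name, paths, and quant type below are illustrative assumptions, not our actual setup; `convert_hf_to_gguf.py` and `llama-quantize` are the real llama.cpp tools.

```python
# Minimal sketch of a llama.cpp GGUF quantization pipeline.
# Assumes a local llama.cpp checkout (built with CMake) and an
# already-downloaded HF model directory; all paths are illustrative.
import subprocess

MODEL_DIR = "Agatha-111B-v1"           # downloaded HF checkpoint (assumed path)
F16_GGUF = "Agatha-111B-v1.f16.gguf"   # intermediate full-precision GGUF
QUANT = "Q4_K_M"                       # one of llama.cpp's quantization types

# 1. Convert the HF safetensors checkpoint into a single f16 GGUF file.
subprocess.run(
    ["python", "llama.cpp/convert_hf_to_gguf.py", MODEL_DIR,
     "--outfile", F16_GGUF, "--outtype", "f16"],
    check=True,
)

# 2. Quantize the f16 GGUF down to the target type.
subprocess.run(
    ["llama.cpp/build/bin/llama-quantize", F16_GGUF,
     f"Agatha-111B-v1.{QUANT}.gguf", QUANT],
    check=True,
)
```

For a 100B+ model the f16 intermediate alone is a couple of hundred gigabytes, which is part of why quantizing these releases is nontrivial work.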
I think what we've been seeing for the past month is the opposite of a boom across all models, and especially finetunes. But indeed, in the past it has been very cyclical, and also driven by releases of appropriately sized Llama models :)
Thank you for the input and perspective, and thank you for all of your work!!