Question
I'm not sure if this is the right forum for questions. If it isn't, please disregard with my apologies.
Will we see another boom in 100B+ models, or are the big boys going extinct? Thoughts?
At the moment we are quantizing Agatha-111B-v1, which was released yesterday, as well as all the MLA-based models such as DeepSeek-R1-0528, DeepSeek-V3-abliterated, and r1-1776, all of which are above 100B. We have also had quite a few Llama 4 Scout finetunes in the past weeks, likewise all above 100B. We are looking forward to doing the dots.llm1-based models as soon as they are supported by llama.cpp, and there is also the soon-to-be-released 2T-parameter Llama 4 Behemoth.

I don't think models above 100B are going anywhere; they are just not that popular because finetuning them is quite expensive. As AstroSage-70B showed, there is also not really a need to go above 70B for domain-specific models. It's also worth mentioning that models on HuggingFace seem to follow a seasonal pattern and come in waves, with most releases landing in the first quarter of the year. If anything, models seem to get larger over time; I feel like we have far more 70B models now than in the past.
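For anyone curious what "quantizing" one of these releases actually involves, here is a minimal sketch of the usual llama.cpp GGUF pipeline: convert the HuggingFace checkpoint to a full-precision GGUF, then quantize it down to a smaller type. The model name, paths, and quant type below are illustrative assumptions, not our actual setup; `convert_hf_to_gguf.py` and `llama-quantize` are the real llama.cpp tools.

```python
# Minimal sketch of a llama.cpp GGUF quantization pipeline.
# Assumes a local llama.cpp checkout (built with CMake) and an
# already-downloaded HF model directory; all paths are illustrative.
import subprocess

MODEL_DIR = "Agatha-111B-v1"           # downloaded HF checkpoint (assumed path)
F16_GGUF = "Agatha-111B-v1.f16.gguf"   # intermediate full-precision GGUF
QUANT = "Q4_K_M"                       # one of llama.cpp's quantization types

# 1. Convert the HF safetensors checkpoint into a single f16 GGUF file.
subprocess.run(
    ["python", "llama.cpp/convert_hf_to_gguf.py", MODEL_DIR,
     "--outfile", F16_GGUF, "--outtype", "f16"],
    check=True,
)

# 2. Quantize the f16 GGUF down to the target type.
subprocess.run(
    ["llama.cpp/build/bin/llama-quantize", F16_GGUF,
     f"Agatha-111B-v1.{QUANT}.gguf", QUANT],
    check=True,
)
```

For a 100B+ model the f16 intermediate alone is a couple of hundred gigabytes, which is part of why quantizing these releases is nontrivial work.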
I think what we've been seeing for the past month is the opposite of a boom across all models, and especially finetunes. But indeed, in the past it has been very cyclical, and also driven by releases of appropriately sized Llama models :)
Thank you for the input and perspective, and thank you for all of your work!!