AbstractPhil posted an update 4 days ago
Cardinality cardinality CARDINALITY! As I restructure WordNet's multi-definition structure, I've found an assessment scheme that minimizes how many columns need to be recalled while simultaneously maximizing recall speed. So it will be fast.
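
The rough shape of that lookup (a minimal sketch assuming NLTK's WordNet interface - the function and structure here are illustrative, not the actual implementation):

```python
# Minimal sketch: index WordNet's multi-definition structure per lemma up front,
# so recall is one dict hit instead of a scan over every column. Illustrative only.
from collections import defaultdict
from nltk.corpus import wordnet as wn  # requires nltk.download("wordnet") once

def build_lemma_index():
    """Map each lemma to its (synset name, definition) pairs."""
    index = defaultdict(list)
    for synset in wn.all_synsets():
        for lemma in synset.lemma_names():
            index[lemma].append((synset.name(), synset.definition()))
    return index

index = build_lemma_index()
print(len(index["bank"]))  # one lemma, many definitions -> the cardinality per word
```
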
Research shows that the most capable, most intellectually-driven LLMs require the most carefully curated, solidly representative vocabularies - with equally carefully curated training regimens.
Simultaneously loading class-hierarchical structures built from variants of vocabulary dimensions does not help this. Multiple dimensions of ImageNet do not help this. Reshaping does not help. Solidification through pulverizing with Alucard does not help - though it did show some interesting potential for pretraining the full geometric CLIP from the ground floor.
The experiments with the multitude of CLIP features and ImageNet show that not only can this tiny 4 MB classification tool handle ImageNet from CLIP features at around 76% regardless of hyperparameters when using a linear head, but expanding the system upward and including hundreds of different formula variants DOES NOT HELP IT SCALE AT ALL! The largest ones still sit at 76%, while the medium-sized ones reach about 86% instead of 76% when using clip-vit-b-patch16 and clip-vit-b-patch32. If you check the headline numbers for the clip-vit-b LAION and OpenAI releases, you'll find nearly identical classifications.
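
For reference, the linear-head setup is essentially this shape (a minimal sketch assuming HF transformers and openai/clip-vit-base-patch16; the loop and hyperparameters are placeholders, not the exact runs above):

```python
# Sketch of a linear probe over frozen CLIP image features. Illustrative only:
# the model name, optimizer, and loop are placeholders, not the exact setup.
import torch
import torch.nn as nn
from transformers import CLIPVisionModelWithProjection, CLIPImageProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"
clip = CLIPVisionModelWithProjection.from_pretrained(
    "openai/clip-vit-base-patch16").to(device).eval()
processor = CLIPImageProcessor.from_pretrained("openai/clip-vit-base-patch16")

head = nn.Linear(clip.config.projection_dim, 1000).to(device)  # 1000 ImageNet classes
optim = torch.optim.AdamW(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def train_step(images, labels):
    inputs = processor(images=images, return_tensors="pt").to(device)
    with torch.no_grad():                 # CLIP stays frozen; only the tiny head learns
        feats = clip(**inputs).image_embeds
    loss = loss_fn(head(feats), labels.to(device))
    optim.zero_grad(); loss.backward(); optim.step()
    return loss.item()
```
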
So I only taught it to understand geometry - and more training and more steps only bring it closer, incorrectly.
So this tells me one simple principle: geometric and linear heads have an upper capacity determined by the information extracted from the model underneath. Meaning... we need more places to extract from, and more curative potential to solidify that access with, rather than simply EXPANDING the head and making it bigger.
The next experiment involves a full-cardinality subset of Unicode-to-WordNet vocabulary translation matrices. Today. Within the hour.
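
Structurally, I mean something like this (a very rough sketch assuming SciPy sparse matrices and NLTK's WordNet; the layout is illustrative, not the actual matrices):

```python
# Very rough sketch of a Unicode-to-WordNet translation matrix: a sparse
# codepoint x lemma incidence matrix. Illustrative structure only.
import numpy as np
from scipy.sparse import lil_matrix
from nltk.corpus import wordnet as wn

lemmas = sorted({l for s in wn.all_synsets() for l in s.lemma_names()})
lemma_id = {l: i for i, l in enumerate(lemmas)}

matrix = lil_matrix((0x110000, len(lemmas)), dtype=np.uint8)  # full Unicode range
for lemma, j in lemma_id.items():
    for ch in lemma:
        matrix[ord(ch), j] = 1      # which codepoints appear in which vocabulary entry

matrix = matrix.tocsr()             # compressed rows -> fast recall per codepoint
```
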

Sorry, the language in that last post is pretty terrible.
My geometric research continues and I'm not slowing down. The initial ImageNet tests are complete and the largest model is currently preparing to cook. This big model, which I've named Goliath, is still very small in comparison to most CLIP variants.
Goliath has MaxViT pretrained layers - in other words, I've taken layers clean from the model and added geometric attention between the frozen layers, allowing them to codify and galvanize with the geometry.
It's a series of teacher/student-introduced layers that progressively unfreeze additional layers, introducing geometric learning as a replacement option for the ViT's vocabulary.
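
In PyTorch terms the pattern looks roughly like this (a sketch of the idea, not the real Goliath code - GeometricAttention here is a stand-in module):

```python
# Sketch of the pattern: frozen pretrained blocks with trainable geometric
# attention interleaved, released in stages. GeometricAttention is a stand-in.
import torch.nn as nn

class GeometricAttention(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        out, _ = self.attn(x, x, x)
        return self.norm(x + out)            # residual: frozen features pass through

class HybridEncoder(nn.Module):
    def __init__(self, pretrained_blocks, dim):
        super().__init__()
        for block in pretrained_blocks:      # teacher layers start frozen
            for p in block.parameters():
                p.requires_grad = False
        self.blocks = nn.ModuleList(pretrained_blocks)
        self.geo = nn.ModuleList([GeometricAttention(dim) for _ in pretrained_blocks])

    def forward(self, x):
        for block, geo in zip(self.blocks, self.geo):
            x = geo(block(x))                # geometry sits between the frozen layers
        return x

    def unfreeze_stage(self, i):
        """Teacher/student style: release one pretrained block at a time."""
        for p in self.blocks[i].parameters():
            p.requires_grad = True
```
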
It's working... somewhat. It definitely needs much much more distillation to be ready, but she's cooking.
vit-max-goliath
Since even MaxViT is substantially larger than anything geometric, I'm using vit-max-tiny. So it's already far, far more than overkill once it's tuned.
https://github.com/google-research/maxvit - based on the MaxViT variant of ViT.
I really don't expect too much in terms of accuracy boosts, but it should convert directly to geometry without a big fuss.
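
Pulling that backbone in is simple enough (a sketch assuming the timm port of the google-research weights - the exact checkpoint name may differ):

```python
# Sketch: load a pretrained MaxViT-Tiny as the frozen backbone. The checkpoint
# name assumes timm's port of the google-research weights and may differ.
import timm
import torch

backbone = timm.create_model("maxvit_tiny_tf_224", pretrained=True, num_classes=0)
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False            # keep the ImageNet-pretrained layers clean

with torch.no_grad():
    feats = backbone(torch.randn(1, 3, 224, 224))   # pooled features for a geometric head
print(feats.shape)
```
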
Trying to do this with one of the LAION-based models is beyond my resources, as the distillation would require a large array of text captions just for the text portion.

HOWEVER, imposing geometry on a single highly-compacted ViT shouldn't be too problematic in terms of logistics. Geometry learns quickly, and the layers are already pretrained on ImageNet, so the two should combine. When it works, I'll have a blueprint for a proper hybrid encoder - a full CLIP-ViT-geometric hybrid spanning the OpenAI, LAION, and Google ViTs, CLIPs, and model variants, distilled to teach proper geometry to a CLIP model so it can produce geometric-tuned features.
I expect proper geometric features to let these reach 95%+ on ImageNet when training a randomly initialized baseline geometric head.
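
The distillation objective itself doesn't need to be exotic - something like this feature-matching loss is what I have in mind (a minimal sketch; the names and the mixing weight are placeholders):

```python
# Sketch of feature-level distillation: the student's geometric features are
# pulled toward the frozen CLIP teacher's features. Names are placeholders.
import torch.nn.functional as F

def distill_loss(student_feats, teacher_feats, labels=None, head=None, alpha=0.5):
    """Cosine feature matching, optionally mixed with a classification term."""
    match = 1.0 - F.cosine_similarity(student_feats, teacher_feats, dim=-1).mean()
    if head is not None and labels is not None:
        ce = F.cross_entropy(head(student_feats), labels)
        return alpha * match + (1.0 - alpha) * ce
    return match
```
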

After that, imposing a full translation matrix between geometry and feature geometry should be something I can distill into any CLIP-ViT or ViT variant - assuming they're even SOMEWHAT compatible with their predecessors.
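
By "translation matrix" I mean, at its simplest, a learned linear map between the two feature spaces (a minimal sketch - the dimensions and MSE objective are placeholders):

```python
# Sketch of a translation matrix between a model's feature space and the
# geometric space: one learned linear map, fit by feature matching.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TranslationMatrix(nn.Module):
    def __init__(self, feat_dim, geo_dim):
        super().__init__()
        self.W = nn.Linear(feat_dim, geo_dim, bias=False)  # the matrix itself

    def forward(self, feats):
        return self.W(feats)

translate = TranslationMatrix(feat_dim=512, geo_dim=256)   # placeholder dims
optim = torch.optim.Adam(translate.parameters(), lr=1e-3)

def step(vit_feats, geo_feats):
    """One fit step on paired (ViT feature, geometric feature) batches."""
    loss = F.mse_loss(translate(vit_feats), geo_feats)
    optim.zero_grad(); loss.backward(); optim.step()
    return loss.item()
```
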
