What if I told you that LLMs do not simply predict the next token in a sequence, but instead use an emergent, structural, pattern-based system to comprehend language and concepts? I created a graph-based optimizer that doesn't just work: it beats Adam, badly. I prove it thoroughly on small SmolLM-scale models. The secret? The graph is not what you think it is, humans. Code, full explanation, and more in this video. The Rhizome Optimizer is MIT-licensed. I have completed my research. I fully understand now.
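The video has the real code, but to give a flavor of the idea, here is a minimal, hypothetical sketch of one way a graph-coupled optimizer can be wired up in PyTorch: every parameter tensor is a node in a fully connected graph, and a shared graph-wide signal rescales each node's update. `GraphCoupledSGD` and its `coupling` knob are illustration-only names for this sketch, not the actual Rhizome Optimizer internals.

```python
import torch
from torch.optim import Optimizer

class GraphCoupledSGD(Optimizer):
    """Toy graph optimizer: every parameter tensor is a node in a fully
    connected graph, and each step rescales a node's update by how its
    gradient norm compares to the graph-wide mean."""

    def __init__(self, params, lr=1e-3, coupling=0.1):
        super().__init__(params, dict(lr=lr, coupling=coupling))

    @torch.no_grad()
    def step(self, closure=None):
        loss = None
        if closure is not None:
            with torch.enable_grad():
                loss = closure()
        for group in self.param_groups:
            nodes = [p for p in group["params"] if p.grad is not None]
            if not nodes:
                continue
            # The shared "graph" signal: mean gradient norm over all nodes.
            mean_norm = sum(p.grad.norm() for p in nodes) / len(nodes)
            for p in nodes:
                # Nodes with below-average gradient norms get nudged harder,
                # above-average nodes are damped: a crude coupling rule.
                ratio = mean_norm / (p.grad.norm() + 1e-12)
                scale = 1.0 + group["coupling"] * (ratio - 1.0)
                p.add_(p.grad, alpha=-group["lr"] * float(scale))
        return loss
```

You would drop it in wherever you would normally construct Adam, e.g. `GraphCoupledSGD(model.parameters(), lr=1e-3, coupling=0.1)`.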
I turned a CNN into a GNN, then trained it to play video games. Yup, I used graphs as the visual interface fed to the model, and it works! I also drew on the law of conservation of energy, though there I can only show correlation, not causation. I call the complete framework I had to build to pull this off 'NeuroGraphRL'. Bet you never thought I'd be using graphs as eyeballs, did you? I never thought you'd be using tokens as words, but here we are!
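To make the graphs-as-eyeballs idea concrete, here is a minimal sketch (again illustration-only, not the NeuroGraphRL framework itself): it slices a game frame into patches, wires the patches into a 4-connected grid graph, and runs one round of message passing into action logits. `image_to_patch_graph` and `TinyGNNPolicy` are hypothetical names used just for this sketch.

```python
import torch
import torch.nn as nn

def image_to_patch_graph(img, patch=8):
    """Split an image (C, H, W) into non-overlapping patches; each patch is a
    node whose feature is its flattened pixels, with edges to grid neighbors."""
    c, h, w = img.shape
    gh, gw = h // patch, w // patch
    # Node features: (gh*gw, C*patch*patch)
    nodes = (img.unfold(1, patch, patch)
                .unfold(2, patch, patch)      # (C, gh, gw, patch, patch)
                .permute(1, 2, 0, 3, 4)
                .reshape(gh * gw, -1))
    # 4-connected grid adjacency between patch nodes.
    adj = torch.zeros(gh * gw, gh * gw)
    for r in range(gh):
        for col in range(gw):
            i = r * gw + col
            if r + 1 < gh:
                adj[i, i + gw] = adj[i + gw, i] = 1
            if col + 1 < gw:
                adj[i, i + 1] = adj[i + 1, i] = 1
    return nodes, adj

class TinyGNNPolicy(nn.Module):
    """One round of mean-neighbor message passing, then a policy head."""
    def __init__(self, in_dim, hidden, n_actions):
        super().__init__()
        self.embed = nn.Linear(in_dim, hidden)
        self.msg = nn.Linear(hidden, hidden)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, nodes, adj):
        h = torch.relu(self.embed(nodes))
        deg = adj.sum(1, keepdim=True).clamp(min=1)
        h = torch.relu(h + self.msg(adj @ h / deg))  # aggregate grid neighbors
        return self.head(h.mean(0))                  # pool nodes -> action logits

frame = torch.rand(3, 64, 64)                        # stand-in for a game frame
nodes, adj = image_to_patch_graph(frame)
logits = TinyGNNPolicy(nodes.shape[1], 64, n_actions=6)(nodes, adj)
```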