Proof-of-Concept for the Reactive Transformer architecture, which moves the working context/history from the prompt to Short-Term Memory (improved)
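As a rough illustration of what "moving context from the prompt to Short-Term Memory" could look like, here is a minimal, hypothetical PyTorch sketch of a stateful interaction loop: the model processes only the current turn, reads conversation history from a fixed-size STM tensor via cross-attention, and rewrites that memory after every interaction. All names, shapes, and the gated update rule are illustrative assumptions, not the released Reactive Transformer implementation.

```python
import torch
import torch.nn as nn

class ShortTermMemoryLoop(nn.Module):
    # Hypothetical sketch: history lives in a fixed-size STM buffer,
    # not in an ever-growing prompt. Each turn reads from the STM via
    # cross-attention and then writes a gated update back into it.
    def __init__(self, d_model: int = 256, mem_slots: int = 64, n_heads: int = 4):
        super().__init__()
        self.register_buffer("stm", torch.zeros(1, mem_slots, d_model))
        self.read = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.write = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(d_model, d_model)  # assumed gated update rule

    def forward(self, turn_embeds: torch.Tensor) -> torch.Tensor:
        # Read: current-turn tokens attend to memory instead of a long prompt.
        ctx, _ = self.read(turn_embeds, self.stm, self.stm)
        # Write: memory slots attend to the turn; a sigmoid gate decides
        # how much of each slot is overwritten, so older context can persist.
        update, _ = self.write(self.stm, turn_embeds, turn_embeds)
        g = torch.sigmoid(self.gate(update))
        self.stm = ((1 - g) * self.stm + g * update).detach()
        return ctx

loop = ShortTermMemoryLoop()
for turn in torch.randn(3, 1, 10, 256):  # three consecutive interactions
    out = loop(turn)  # (1, 10, 256); state carries over between turns
```

Under these assumptions, per-turn cost stays constant because the STM has a fixed number of slots, whereas a prompt-based history grows with every turn.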
AI & ML interests
AGI, Reactive Awareness Models, Memory Systems, Reactive Neural Networks
Recent Activity
Second Proof-of-Concept for the Reactive Transformer architecture, which moves the working context/history from the prompt to Short-Term Memory
Experimental models with Sparse Query Attention layers, reducing training time/cost by ~3-10% compared to GQA & MQA at the same level of performance (a rough sketch of the idea follows below)
Proof-of-Concept for the Reactive Transformer architecture, which moves the working context/history from the prompt to Short-Term Memory
Datasets used for Interaction Supervised Fine-Tuning (SFT) of reactive models, made for real-time processing of a single sequence (interaction)
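The Sparse Query Attention claim above rests on a simple asymmetry: GQA and MQA shrink the number of key/value heads (saving KV-cache memory bandwidth) but keep every query head, so the score computation itself is unchanged, while SQA shrinks the query heads and therefore the attention FLOPs. Below is a minimal, hypothetical PyTorch sketch of that idea; the head counts and the GQA-style key/value sharing are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseQueryAttention(nn.Module):
    # A baseline MHA layer here would use n_heads=8 query heads; this
    # sketch keeps head_dim the same but uses only n_q_heads=4 query
    # heads, halving the size of the attention score computation.
    def __init__(self, d_model: int = 512, n_heads: int = 8,
                 n_q_heads: int = 4, n_kv_heads: int = 2):
        super().__init__()
        assert n_q_heads % n_kv_heads == 0
        self.head_dim = d_model // n_heads
        self.n_q_heads, self.n_kv_heads = n_q_heads, n_kv_heads
        self.q_proj = nn.Linear(d_model, n_q_heads * self.head_dim)
        self.k_proj = nn.Linear(d_model, n_kv_heads * self.head_dim)
        self.v_proj = nn.Linear(d_model, n_kv_heads * self.head_dim)
        self.o_proj = nn.Linear(n_q_heads * self.head_dim, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.n_q_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        # GQA-style sharing on the KV side: each KV head serves a group
        # of query heads (here 2 query heads per KV head).
        groups = self.n_q_heads // self.n_kv_heads
        k = k.repeat_interleave(groups, dim=1)
        v = v.repeat_interleave(groups, dim=1)
        out = F.scaled_dot_product_attention(q, k, v)  # (b, n_q_heads, t, head_dim)
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1))

sqa = SparseQueryAttention()
y = sqa(torch.randn(2, 128, 512))  # -> (2, 128, 512)
```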