Third PoC Reactive Transformer attempt: doubled dimensionality (x2) and quadrupled model size (x4) compared to the Micro-Plus series, but still trained on synthetic data.
AI & ML interests
AGI, ASI, Reactive Awareness Models, Real-Time Reactive Language Models, Memory Systems, Reactive Neural Networks & Event-Driven AI
Recent Activity
Experimental models with Sparse Query Attention (SQA) layers, reducing training time/cost by ~3-10% compared to GQA and MQA at the same performance level (a minimal SQA sketch appears at the end of this page)
Second Proof-of-Concept for the Reactive Transformer architecture, which moves the working context/history from the prompt into Short-Term Memory
Proof-of-Concept for the Reactive Transformer architecture, which moves the working context/history from the prompt into Short-Term Memory (improved)
- ReactiveAI/RxT-Alpha-Micro-Plus (Text Generation)
- ReactiveAI/RxT-Alpha-Micro-Plus-SI-Supervised (Text Generation)
- ReactiveAI/RxT-Alpha-Micro-Plus-I-Supervised (Text Generation)
- ReactiveAI/RxT-Alpha-Micro-Plus-Decoder-SFT (Text Generation)
Proof-of-Concept for the Reactive Transformer architecture, which moves the working context/history from the prompt into Short-Term Memory (see the conceptual sketch below)
Datasets used for Interaction Supervised Fine-Tuning (SFT) of reactive models, built for real-time processing of a single sequence (one interaction)
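
The Reactive Transformer collections above share one core idea: instead of re-sending the full conversation history in the prompt, a reactive model processes one interaction at a time and keeps history in a fixed-size Short-Term Memory (STM). The snippet below is only a conceptual sketch of that event-driven loop, not the released RxT code; `generate`, `encode_interaction`, `update_memory`, and the STM shape are hypothetical placeholders standing in for the real decoder, encoder, and memory-attention components.

```python
# Conceptual sketch of the event-driven interaction loop; NOT the released RxT code.
from dataclasses import dataclass, field
import torch

@dataclass
class ShortTermMemory:
    # Fixed-size memory slots; the size does not grow with conversation length.
    slots: torch.Tensor = field(default_factory=lambda: torch.zeros(1, 64, 256))

def generate(query: str, stm: ShortTermMemory) -> str:
    # A real decoder would attend to the query tokens and cross-attend to stm.slots.
    return f"response to: {query}"  # placeholder

def encode_interaction(query: str, response: str) -> torch.Tensor:
    # A real encoder would embed the single finished interaction (query + response).
    return torch.randn(1, 64, 256)  # placeholder

def update_memory(stm: ShortTermMemory, encoded: torch.Tensor) -> ShortTermMemory:
    # Stand-in for a gated memory update; the STM keeps a constant size.
    stm.slots = 0.9 * stm.slots + 0.1 * encoded
    return stm

stm = ShortTermMemory()
for query in ["Hi!", "What did I just say?"]:
    response = generate(query, stm)  # per-turn cost depends on one interaction, not the full history
    stm = update_memory(stm, encode_interaction(query, response))
```

The point of the loop is that per-turn cost depends only on the current interaction, while the history lives in an STM whose size stays constant however long the conversation runs.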
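
For the Sparse Query Attention collection mentioned above, the sketch below shows one way such a layer can be written, assuming only what the description states: SQA reduces the number of query heads (rather than only the key/value heads, as GQA and MQA do), which shrinks the attention score and value matmuls and thus training FLOPs. The head counts, dimensions, and K/V sharing scheme are illustrative assumptions, not the released configurations.

```python
# Minimal sketch of a Sparse Query Attention layer: fewer query heads than a
# standard multi-head layer of the same width. Illustrative, not the released code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseQueryAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 16, num_q_heads: int = 8, num_kv_heads: int = 4):
        super().__init__()
        assert dim % num_heads == 0 and num_q_heads % num_kv_heads == 0
        self.head_dim = dim // num_heads   # per-head width of the reference full-head layer
        self.num_q_heads = num_q_heads     # fewer query heads than num_heads: the SQA saving
        self.num_kv_heads = num_kv_heads
        self.q_proj = nn.Linear(dim, num_q_heads * self.head_dim)
        self.k_proj = nn.Linear(dim, num_kv_heads * self.head_dim)
        self.v_proj = nn.Linear(dim, num_kv_heads * self.head_dim)
        self.o_proj = nn.Linear(num_q_heads * self.head_dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.num_q_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(b, t, self.num_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(b, t, self.num_kv_heads, self.head_dim).transpose(1, 2)
        # Share each K/V head across a group of query heads (GQA-style), but the
        # score and value matmuls now run over num_q_heads < num_heads.
        rep = self.num_q_heads // self.num_kv_heads
        k = k.repeat_interleave(rep, dim=1)
        v = v.repeat_interleave(rep, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1))

layer = SparseQueryAttention(dim=256)
out = layer(torch.randn(2, 32, 256))  # -> torch.Size([2, 32, 256])
```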