Didn’t you know AGI is already here 🤖
Sarath Shekkizhar (shekkizh)
AI & ML interests: None yet

shekkizh's activity

replied to their post · about 15 hours ago

replied to their post · about 15 hours ago
Images are split into patches, and each patch is tokenized: the patch is projected into a feature dimension and quantized. That pipeline probably already includes a CNN and/or attention. The issue is that the model is unable to jointly reason about color and text in the tokenized space.
We ran about 1,000 experiments: different prompting strategies, tool calls to a separate model for recognition, and several other techniques. The results still hold. The paper covers only a small part of the analysis. 🤷‍♂️
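To make the tokenization step concrete, here's a rough sketch of ViT-style patchify-and-project (illustrative shapes and names, not the actual CUA pipeline):

```python
# Rough sketch of ViT-style patch tokenization (illustrative only; not
# the actual CUA pipeline). The point: an image becomes a sequence of
# feature vectors, so fine-grained color/text detail gets compressed.
import numpy as np

def patchify(image: np.ndarray, patch: int = 16) -> np.ndarray:
    h, w, c = image.shape
    grid = image.reshape(h // patch, patch, w // patch, patch, c)
    grid = grid.transpose(0, 2, 1, 3, 4)            # (gh, gw, patch, patch, c)
    return grid.reshape(-1, patch * patch * c)      # one flat row per patch

rng = np.random.default_rng(0)
screenshot = rng.random((224, 224, 3))              # dummy image
patches = patchify(screenshot)                      # (196, 768)
projection = rng.standard_normal((768, 512))        # learned in a real model
tokens = patches @ projection                       # (196, 512) patch "tokens"
print(tokens.shape)
```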

posted an update · 3 days ago
🙋🏽‍♂️ Is your "multi-agent" system really multi-agentic? Or is it just a modular setup with a bunch of different prompts? 🤨
I’ve had this discussion way too often, so I finally wrote it all down. If you’re building with agents, you need to read this.
Here’s the TL;DR:
✅ True multi-agent systems require:
• Persistent, private state per agent
• Memory that impacts future decisions
• Adaptation based on past experiences
❌ Just having modular components, function calls, or multiple LLMs doesn’t cut it. That’s not multi-agentic. It’s just pipelining.
🤝 The magic is in evolving relationships, context retention, and behavioral shifts over time.
🧠 If your agents aren’t learning from each other or changing based on past experience… you are missing the point.
What do you think? Curious what patterns you're experimenting with 🧐
👉 Full post: https://shekkizh.github.io/posts/2025/04/multi-agents/
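If it helps, here's a minimal sketch of the distinction (all names hypothetical, just to illustrate the pattern):

```python
# Minimal sketch of the distinction (hypothetical names). A stateful agent
# keeps private memory that changes its future decisions; a pipeline step
# is a pure function over its input.
from dataclasses import dataclass, field

@dataclass
class StatefulAgent:
    name: str
    memory: list = field(default_factory=list)  # persistent, private state
    trust: dict = field(default_factory=dict)   # evolving view of peers

    def act(self, message: str, sender: str) -> str:
        # Past experience shapes the current decision.
        decision = "accept" if self.trust.get(sender, 0.5) > 0.6 else "verify"
        self.memory.append((sender, message, decision))
        return decision

    def feedback(self, sender: str, outcome: float) -> None:
        # Adaptation: update trust in a peer based on observed outcomes.
        prior = self.trust.get(sender, 0.5)
        self.trust[sender] = 0.5 * prior + 0.5 * outcome

def pipeline_step(message: str) -> str:
    # Stateless: same input, same output, no history. Modular, not agentic.
    return f"processed: {message}"

agent = StatefulAgent("reviewer")
print(agent.act("merge change #1", sender="planner"))  # verify (no history yet)
agent.feedback("planner", outcome=1.0)                 # planner proved reliable
print(agent.act("merge change #2", sender="planner"))  # accept (trust adapted)
```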

posted an update · 4 days ago
Think AGI is just around the corner? Not so fast.
When OpenAI released its Computer-Using Agent (CUA) API, I happened to be playing Wordle 🧩 and thought, why not see how the model handles it?
Spoiler: Wordle turned out to be a surprisingly effective benchmark.
So Romain Cosentino, Ph.D., and I dug in and analyzed the results of several hundred runs.
🔑 Takeaways
1️⃣ Even the best computer-using models struggle with simple, context-dependent tasks.
2️⃣ Visual perception and reasoning remain major hurdles for multimodal agents.
3️⃣ Real-world use cases reveal significant gaps between hype and reality. Perception accuracy drops to near zero by the last turn 📉
🔗 Read our arXiv paper for more details: https://www.arxiv.org/abs/2504.15434
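For context on why Wordle is a nice probe: the rule the agent must track is tiny. Here's a sketch of the per-letter feedback logic (illustrative, not our evaluation harness):

```python
# Sketch of Wordle's per-letter feedback rule (illustrative; not the
# paper's evaluation code). Two passes handle duplicate letters correctly.
from collections import Counter

def wordle_feedback(guess: str, answer: str) -> list[str]:
    feedback = ["absent"] * len(guess)
    remaining = Counter(answer)
    # First pass: exact-position matches (green).
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            feedback[i] = "correct"
            remaining[g] -= 1
    # Second pass: right letter, wrong position (yellow).
    for i, g in enumerate(guess):
        if feedback[i] == "absent" and remaining[g] > 0:
            feedback[i] = "present"
            remaining[g] -= 1
    return feedback

print(wordle_feedback("crane", "cacao"))
# ['correct', 'absent', 'present', 'absent', 'absent']
```

An agent has to integrate this feedback across turns; exactly the kind of simple, context-dependent state tracking where we saw models struggle.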

posted an update · 18 days ago
Some interesting architectural choices made in Llama 4 models -- were these key to the 10M context? Possibly 🤔
🔍 Takeaways:
🧩 Interleaved attention without positional encoding
- Llama 4 removes explicit positional encoding in some attention layers to boost performance on longer contexts.
- The principle could be similar to that of residual connections: letting the model attend to early tokens without positional decay.
⚖️ Scaled Softmax to increase attention at inference time
- The max attention value (output of softmax) decreases as context size increases.
- Llama 4 incorporates a context-size dependent temperature in the softmax function to modify the slope of softmax, allowing the model to focus better on relevant tokens.
- Done only at inference time -- my guess is this was a choice made after observations on eval datasets. (Rough sketch of the idea below.)
What did you think of these choices?
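To make the scaled-softmax point concrete, here's a rough sketch of the idea (assuming a logarithmic, context-length-dependent scale; the exact schedule Llama 4 uses may differ):

```python
# Rough sketch of length-dependent softmax scaling (illustrative; the
# exact schedule in Llama 4 may differ). As context grows, softmax
# outputs flatten, so scores are sharpened by a length-dependent scale.
import numpy as np

def scaled_softmax(scores: np.ndarray, context_len: int,
                   beta: float = 0.1, base_len: int = 8192) -> np.ndarray:
    # Hypothetical schedule: identity below base_len, log growth above.
    scale = 1.0 + beta * np.log(max(context_len / base_len, 1.0))
    z = scores * scale
    z = z - z.max()              # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
scores = rng.standard_normal(16)
short = scaled_softmax(scores, context_len=4_096)      # scale == 1.0
long = scaled_softmax(scores, context_len=1_000_000)   # sharper distribution
print(short.max(), long.max())                         # long attends more sharply
```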

posted an update · 12 months ago
Hi folks,
Tenyx announced its latest model, Llama3-TenyxChat-70B, which outperforms a GPT-4 variant on several MT-Bench measurements.
Post-trained from Llama-3 70B in just 15 hours, the model improves reasoning capabilities by leveraging the relationship between geometry and LLM task complexity (take a look at our paper, to be presented at ICML 2024: https://arxiv.org/abs/2312.01648).
Model: tenyx/Llama3-TenyxChat-70B · Hugging Face Space: tenyx/Llama3-TenyxChat-70B
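If you want to try it, here's a minimal sketch with transformers (standard usage; the prompt and generation settings are illustrative):

```python
# Minimal sketch of running the model with transformers (standard API;
# sampling settings are illustrative). Needs enough GPU memory for 70B.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tenyx/Llama3-TenyxChat-70B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Summarize MT-Bench in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```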