Sarath Shekkizhar

shekkizh

AI & ML interests

None yet


Organizations

Salesforce · Tenyx · Blog-explorers · ZeroGPU Explorers · Social Post Explorers · Hugging Face Discord Community

shekkizh's activity

replied to their post about 17 hours ago

Didn’t you know AGI is already here 🤖

replied to their post about 17 hours ago

Images are split into patches and each patch is tokenized: the tokenizer maps each patch into a feature dimension and quantizes it. This probably already involves a CNN and/or attention. The issue is that the model is not able to reason jointly over color and text in the tokenized space.
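Here's a toy sketch of that pipeline, just to make it concrete (shapes and codebook size are made up; the real tokenizer is learned end to end):

```python
import torch

# Illustrative sizes only -- not any actual tokenizer's dimensions.
PATCH, DIM, CODEBOOK_SIZE = 16, 512, 8192

patchify = torch.nn.Conv2d(3, DIM, kernel_size=PATCH, stride=PATCH)  # patch -> feature
codebook = torch.randn(CODEBOOK_SIZE, DIM)  # learned via vector quantization in practice

def tokenize(image: torch.Tensor) -> torch.Tensor:
    """image: (B, 3, H, W) -> discrete patch-token ids (B, num_patches)."""
    feats = patchify(image).flatten(2).transpose(1, 2)           # (B, num_patches, DIM)
    dists = torch.cdist(feats, codebook.expand(feats.size(0), -1, -1))
    return dists.argmin(dim=-1)  # nearest codebook entry becomes the "token"

ids = tokenize(torch.randn(1, 3, 224, 224))  # -> shape (1, 196)
```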

We ran about 1000 experiments: different prompting strategies, tool calls to a different model for recognition, and several other techniques. The results still hold. The paper is a small part of the analysis. 🤷‍♂️

posted an update 3 days ago
🙋🏽‍♂️ Is your "multi-agent" system really multi-agentic? Or is it just a modular setup with a bunch of different prompts? 🤨

I’ve had this discussion way too often, so I finally wrote it all down. If you’re building with agents, you need to read this.

Here’s the TLDR:
✅ True multi-agent systems require:
• Persistent, private state per agent
• Memory that impacts future decisions
• Adaptation based on past experiences

❌ Just having modular components, function calls, or multiple LLMs doesn't cut it. That's not multi-agentic. It's just pipelining.

🤝 The magic is in evolving relationships, context retention, and behavioral shifts over time.
🧠 If your agents aren't learning from each other or changing based on past experience… you are missing the point. (Toy sketch of the distinction below.)
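A minimal sketch of what I mean (hypothetical classes, not any particular framework; `call_llm` is a stand-in for whatever client you use):

```python
def call_llm(prompt: str) -> str:
    """Stand-in for any LLM API call; swap in your client of choice."""
    return f"response-to({prompt[:30]}...)"

# Pipelining: stateless calls chained together. No memory, no adaptation.
def pipeline(task: str) -> str:
    draft = call_llm(f"Plan: {task}")
    return call_llm(f"Critique: {draft}")

# Multi-agent: each agent keeps persistent, private state that shapes future actions.
class Agent:
    def __init__(self, name: str):
        self.name = name
        self.memory: list[tuple[str, str]] = []    # private, persists across turns

    def act(self, observation: str) -> str:
        recent = self.memory[-5:]                  # past experience conditions the call
        action = call_llm(f"{self.name} | history={recent} | now={observation}")
        self.memory.append((observation, action))  # adaptation: behavior shifts over time
        return action

planner, critic = Agent("planner"), Agent("critic")
plan = planner.act("ship the feature")
review = critic.act(plan)   # the critic builds its own private view of the planner
```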

What do you think? Curious what patterns you're experimenting with 🧐

👉 Full post: https://shekkizh.github.io/posts/2025/04/multi-agents/
posted an update 4 days ago
Think AGI is just around the corner? Not so fast.

When OpenAI released its Computer-Using Agent (CUA) API, I happened to be playing Wordle 🧩 and thought, why not see how the model handles it?
Spoiler: Wordle turned out to be a surprisingly effective benchmark.
So Romain Cosentino, Ph.D., and I dug in and analyzed the results of several hundred runs.
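For anyone who hasn't played: the agent has to reason over per-letter feedback each turn. A minimal scorer, just to show what the model must track (illustrative, not our eval harness):

```python
from collections import Counter

def wordle_feedback(guess: str, answer: str) -> str:
    """G = right letter, right spot; Y = right letter, wrong spot; - = miss."""
    feedback = ["-"] * len(guess)
    # Answer letters not already matched exactly (handles repeated letters).
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            feedback[i] = "G"
        elif unmatched[g] > 0:
            feedback[i] = "Y"
            unmatched[g] -= 1
    return "".join(feedback)

print(wordle_feedback("crane", "cargo"))  # GYY--
```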

🔑 Takeaways
1️⃣ Even the best computer-using models struggle with simple, context-dependent tasks. 
2️⃣ Visual perception and reasoning remain major hurdles for multimodal agents.
3️⃣ Real-world use cases reveal significant gaps between hype and reality. Perception accuracy drops to near zero by the last turn 📉

🔗 Read our arXiv article for more details: https://www.arxiv.org/abs/2504.15434
posted an update 18 days ago
Some interesting architectural choices made in Llama 4 models -- were these key to the 10M context? Possibly 🤔

🔍 Takeaways:
🧩 Interleaved Attention without position encoding
- Llama 4 removes explicit positional encoding in some attention layers to boost performance on longer contexts.
- The principle could be similar to residual connections: letting attention reach early tokens without positional decay. (Toy illustration below.)
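A toy illustration of the interleaving (the interval is my assumption from the released configs; check the model code for the real pattern):

```python
# Hypothetical: apply RoPE in most layers, skip it (NoPE) every 4th layer.
def layer_uses_rope(layer_idx: int, nope_interval: int = 4) -> bool:
    return (layer_idx + 1) % nope_interval != 0

print([("RoPE" if layer_uses_rope(i) else "NoPE") for i in range(8)])
# ['RoPE', 'RoPE', 'RoPE', 'NoPE', 'RoPE', 'RoPE', 'RoPE', 'NoPE']
```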

⚖️ Scaled Softmax to increase attention at inference time
- The max attention value (the softmax output) decreases as context size increases.
- Llama 4 incorporates a context-size-dependent temperature in the softmax to modify its slope, letting the model focus better on relevant tokens.
- Done only at inference time -- my guess is this was a choice made after some observations on eval datasets. (Rough sketch below.)
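Roughly, in code (the scaling function and constants here are my guess at the shape of the idea, not Llama 4's exact formula):

```python
import math
import torch

def attn_weights(q, k, position: int, beta: float = 0.1, floor: int = 8192):
    """Sharpen softmax as context grows: scale queries up with position so the
    max attention value doesn't flatten out at long context lengths."""
    temp = 1.0 + beta * math.log(position / floor + 1)  # grows slowly with context
    logits = (q * temp) @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    return torch.softmax(logits, dim=-1)

q = torch.randn(1, 4, 64)     # (batch, query_len, head_dim)
k = torch.randn(1, 4096, 64)  # (batch, context_len, head_dim)
w = attn_weights(q, k, position=200_000)  # long-context query gets sharper weights
```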

What did you think of these choices?

New activity in openbmb/RLAIF-V-Dataset 11 months ago
reacted to their post with 🚀 12 months ago
posted an update 12 months ago
Hi folks,
Tenyx announced its latest model, Llama3-TenyxChat-70B, which outperforms a GPT-4 variant on several MT-Bench measurements.

By post-training Llama-3 70B in 15 hours, our model improves reasoning capabilities by leveraging the relationship between geometry and LLM task complexity (take a look at our paper, to be presented at ICML 2024: https://arxiv.org/abs/2312.01648).
Model: tenyx/Llama3-TenyxChat-70B · HuggingFace Space: tenyx/Llama3-TenyxChat-70B
New activity in tenyx/Llama3-TenyxChat-70B 12 months ago

great evals
#2 opened 12 months ago by gblazex