Mohammed Mohammed Ali

MohammedEltoum

AI & ML interests

None yet

Recent Activity

reacted to hesamation's post with ❤️ 8 days ago

Google published a 69-page whitepaper on Prompt Engineering and its best practices, a must-read if you are using LLMs in production:
> zero-shot, one-shot, few-shot
> system prompting
> chain-of-thought (CoT)
> ReAct
> code prompting
> best practices
LINK: https://www.kaggle.com/whitepaper-prompt-engineering

MohammedEltoum's activity

upvoted a paper 7 days ago

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Paper • 2504.01990 • Published 17 days ago • 242

upvoted a paper 10 days ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published 10 days ago • 160

upvoted a paper 25 days ago

OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement

Paper • 2503.17352 • Published 27 days ago • 22

upvoted a paper 26 days ago

One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation

Paper • 2503.13358 • Published Mar 17 • 95

upvoted a collection about 2 months ago

olmOCR

Collection

olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 4 items • Updated 29 days ago • 104

upvoted 3 papers 2 months ago

upvoted a paper 3 months ago

PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding

Paper • 2501.16411 • Published Jan 27 • 18

upvoted a paper 5 months ago

Natural Language Reinforcement Learning

Paper • 2411.14251 • Published Nov 21, 2024 • 31

upvoted a paper 6 months ago

MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents

Paper • 2410.03450 • Published Oct 4, 2024 • 37

upvoted a collection 6 months ago

Molmo

Collection

Artifacts for open multimodal language models. • 5 items • Updated Mar 13 • 302

upvoted a paper 6 months ago

CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling

Paper • 2409.19291 • Published Sep 28, 2024 • 19

upvoted a collection 7 months ago

Emu3

Collection

Emu3: Next-Token Prediction is All You Need • 7 items • Updated Feb 13 • 70

upvoted 4 papers 7 months ago

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25, 2024 • 114

Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection

Paper • 2409.08513 • Published Sep 13, 2024 • 14

ReCLAP: Improving Zero Shot Audio Classification by Describing Sounds

Paper • 2409.09213 • Published Sep 13, 2024 • 13

MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization

Paper • 2408.02555 • Published Aug 5, 2024 • 33