Anas Awadalla

anas-awadalla

Recent Activity

new activity about 22 hours ago

mlfoundations/gelato-evals:Upload osworld-g-eval-refined.json with huggingface_hub

new activity about 22 hours ago

mlfoundations/gelato-evals:Upload osworld-g-eval.json with huggingface_hub

updated a collection 3 days ago

🍨 Gelato-30B-A3B Checkpoints

upvoted a collection 10 days ago

R-HORIZON

The training and evaluation datasets for Paper "How Far Can Your Large Reasoning Model Really Go in Breadth and Depth?" • 6 items • Updated 11 days ago • 6

upvoted a collection 26 days ago

Qwen3-VL

37 items • Updated about 10 hours ago • 351

upvoted an article about 1 month ago

ScreenEnv: Deploy your full stack Desktop Agent

Article • Jul 10 • 72

upvoted 2 papers about 1 month ago

Scaling Agents via Continual Pre-training

Paper • 2509.13310 • Published Sep 16 • 112

LLM-I: LLMs are Naturally Interleaved Multimodal Creators

Paper • 2509.13642 • Published Sep 17 • 8

upvoted a paper 3 months ago

The Invisible Leash: Why RLVR May Not Escape Its Origin

Paper • 2507.14843 • Published Jul 20 • 84

upvoted a collection 4 months ago

WaveUI

WaveUI is a collection of datasets and tools to improve UI object detection • 6 items • Updated Jul 31, 2024 • 10

upvoted a paper 5 months ago

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30 • 138

upvoted a paper 12 months ago

BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions

Paper • 2411.07461 • Published Nov 12, 2024 • 23

upvoted 4 papers about 1 year ago

To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 43

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 63

xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Paper • 2408.08872 • Published Aug 16, 2024 • 100

JPEG-LM: LLMs as Image Generators with Canonical Codec Representations

Paper • 2408.08459 • Published Aug 15, 2024 • 45

upvoted a collection about 1 year ago

XGen-MM-1 models and datasets

A collection of all XGen-MM (Foundation LMM) models! • 18 items • Updated 1 day ago • 39

upvoted 2 collections over 1 year ago

Gemma 2 2B Release

The 2.6B parameter version of Gemma 2. • 6 items • Updated Jul 10 • 80

🍃 MINT-1T

Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 14 items • Updated 10 days ago • 62

upvoted a paper over 1 year ago

PaliGemma: A versatile 3B VLM for transfer

Paper • 2407.07726 • Published Jul 10, 2024 • 72

upvoted a collection over 1 year ago

4M Tokenizers

Multimodal tokenizers from https://4m.epfl.ch/ • 15 items • Updated Mar 7 • 6

upvoted 2 papers over 1 year ago

What matters when building vision-language models?

Paper • 2405.02246 • Published May 3, 2024 • 103

MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens

Paper • 2406.11271 • Published Jun 17, 2024 • 21