Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • 2 days ago • 412
How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks Paper • 2507.01955 • Published 8 days ago • 29
V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann • 8 items • Updated 27 days ago • 144
ERNIE 4.5 Collection Collection of ERNIE 4.5 models. "-Paddle" models use PaddlePaddle weights, while "-PT" models use Transformer-style PyTorch weights. • 23 items • Updated 7 days ago • 146
SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities Paper • 2401.12168 • Published Jan 22, 2024 • 29
Article Gemma 3n fully available in the open-source ecosystem! By ariG23498 and 7 others • 14 days ago • 105
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20, 2024 • 107
Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models Paper • 2505.17015 • Published May 22 • 9
Article nanoVLM: The simplest repository to train your VLM in pure PyTorch By ariG23498 and 6 others • May 21 • 185
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Paper • 2505.09568 • Published May 14 • 94
Qwen2.5-Omni Collection End-to-end omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 7 items • Updated May 21 • 148
Qwen2-VL Collection Vision-language model series based on Qwen2 • 16 items • Updated Apr 28 • 220
Article Vision Language Models (Better, Faster, Stronger) By merve and 4 others • May 12 • 474