Bhimraj Yadav PRO

bhimrazy

https://bhimraj.com.np

AI & ML interests

Computer Vision, Healthcare, Generative AI and NLP

bhimrazy's activity

upvoted a paper 4 days ago

Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos

Paper • 2501.13826 • Published 7 days ago • 21

upvoted 5 papers 5 days ago

Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

Paper • 2501.11425 • Published 10 days ago • 84

MMVU: Measuring Expert-Level Multi-Discipline Video Understanding

Paper • 2501.12380 • Published 9 days ago • 79

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published 17 days ago • 89

BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature

Paper • 2501.07171 • Published 17 days ago • 49

Agent Laboratory: Using LLM Agents as Research Assistants

Paper • 2501.04227 • Published 23 days ago • 85

upvoted a paper 6 days ago

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

Paper • 2501.13106 • Published 8 days ago • 75

upvoted a paper 7 days ago

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published 8 days ago • 270

upvoted a paper 10 days ago

MinMo: A Multimodal Large Language Model for Seamless Voice Interaction

Paper • 2501.06282 • Published 20 days ago • 42

upvoted a paper 16 days ago

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published 16 days ago • 271

upvoted a paper 17 days ago

Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension

Paper • 2411.13093 • Published Nov 20, 2024 • 1

upvoted 9 papers 18 days ago

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs

Paper • 2501.06186 • Published 20 days ago • 59

VideoRAG: Retrieval-Augmented Generation over Video Corpus

Paper • 2501.05874 • Published 20 days ago • 66

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Paper • 2501.01957 • Published 27 days ago • 42

Virgo: A Preliminary Exploration on Reproducing o1-like MLLM

Paper • 2501.01904 • Published 27 days ago • 31

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

Paper • 2501.03895 • Published 23 days ago • 48

Cosmos World Foundation Model Platform for Physical AI

Paper • 2501.03575 • Published 23 days ago • 67

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Paper • 2501.05366 • Published 21 days ago • 84

Enhancing Human-Like Responses in Large Language Models

Paper • 2501.05032 • Published 21 days ago • 49

Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought

Paper • 2501.04682 • Published 22 days ago • 90