Marcos Henrique

wakeupmh

AI & ML interests

None yet

Recent Activity

upvoted 3 papers 14 days ago

Training for X-Ray Vision: Amodal Segmentation, Amodal Content Completion, and View-Invariant Object Representation from Multi-Camera Video

Paper • 2507.00339 • Published 16 days ago • 10

MARBLE: A Hard Benchmark for Multimodal Spatial Reasoning and Planning

Paper • 2506.22992 • Published 18 days ago • 12

Consistent Time-of-Flight Depth Denoising via Graph-Informed Geometric Attention

Paper • 2506.23542 • Published 17 days ago • 14

upvoted a paper 15 days ago

BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing

Paper • 2506.17450 • Published 26 days ago • 62

upvoted 7 papers 24 days ago

PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers

Paper • 2506.05573 • Published Jun 5 • 71

SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Paper • 2506.01844 • Published Jun 2 • 114

Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning

Paper • 2506.07044 • Published Jun 8 • 108

Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30 • 261

Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency

Paper • 2506.08343 • Published Jun 10 • 49

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

Paper • 2506.14965 • Published 29 days ago • 47

DoTA-RAG: Dynamic of Thought Aggregation RAG

Paper • 2506.12571 • Published Jun 14 • 49

upvoted an article about 1 month ago

Article

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

and 8 others • Jun 3 • 202

upvoted an article about 2 months ago

Article

nanoVLM: The simplest repository to train your VLM in pure PyTorch

and 6 others • May 21 • 187

upvoted an article 2 months ago

Article

Blazingly fast whisper transcriptions with Inference Endpoints

and 5 others • May 13 • 71

upvoted an article 4 months ago

Article

LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on your Phone!

and 1 other • Mar 7 • 70
