Stoney Kang

sikang99

AI & ML interests

Remote Control based on Vision

sikang99's activity

upvoted a paper about 5 hours ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published 1 day ago • 172

upvoted a paper about 15 hours ago

PRIMA.CPP: Speeding Up 70B-Scale LLM Inference on Low-Resource Everyday Home Clusters

Paper • 2504.08791 • Published 9 days ago • 95

upvoted a paper 1 day ago

VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model

Paper • 2504.07615 • Published 6 days ago • 19

upvoted a paper 2 days ago

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

Paper • 2504.08685 • Published 4 days ago • 101

upvoted 5 papers 4 days ago

upvoted 2 papers 8 days ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published 8 days ago • 158

Scene-Centric Unsupervised Panoptic Segmentation

Paper • 2504.01955 • Published 13 days ago • 5

upvoted 2 papers 13 days ago

Efficient LLaMA-3.2-Vision by Trimming Cross-attended Visual Features

Paper • 2504.00557 • Published 15 days ago • 15

Scaling Language-Free Visual Representation Learning

Paper • 2504.01017 • Published 14 days ago • 26

upvoted a paper 14 days ago

Video-R1: Reinforcing Video Reasoning in MLLMs

Paper • 2503.21776 • Published 19 days ago • 76

upvoted 3 papers 15 days ago

Challenges and Paths Towards AI for Software Engineering

Paper • 2503.22625 • Published 18 days ago • 3

Hi3DGen: High-fidelity 3D Geometry Generation from Images via Normal Bridging

Paper • 2503.22236 • Published 19 days ago • 11

Segment Any Motion in Videos

Paper • 2503.22268 • Published 19 days ago • 17

upvoted 3 papers 20 days ago

BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation

Paper • 2503.20672 • Published 20 days ago • 13

Gemini Robotics: Bringing AI into the Physical World

Paper • 2503.20020 • Published 21 days ago • 23

Qwen2.5-Omni Technical Report

Paper • 2503.20215 • Published 21 days ago • 134