SenseTime X-Lab

company

Sense-X

AI & ML interests

Computer vision, Reinforcement learning

Recent Activity

Andy1621 authored a paper about 1 month ago

Emerging Properties in Unified Multimodal Pretraining

Andy1621 authored a paper about 1 month ago

Unmasked Teacher: Towards Training-Efficient Video Foundation Models

Andy1621 authored a paper about 1 month ago

Harvest Video Foundation Models via Efficient Post-Pretraining

Andy1621

authored 17 papers about 1 month ago

UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning

Paper • 2201.04676 • Published Jan 12, 2022

UniFormer: Unifying Convolution and Self-attention for Visual Recognition

Paper • 2201.09450 • Published Jan 24, 2022

You Only Need 90K Parameters to Adapt Light: A Light Weight Transformer for Image Enhancement and Exposure Correction

Paper • 2205.14871 • Published May 30, 2022

UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer

Paper • 2211.09552 • Published Nov 17, 2022

InternVideo: General Video Foundation Models via Generative and Discriminative Learning

Paper • 2212.03191 • Published Dec 6, 2022

MUSES: 3D-Controllable Image Generation via Multi-Modal Agent Collaboration

Paper • 2408.10605 • Published Aug 20, 2024 • 1

TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning

Paper • 2410.19702 • Published Oct 25, 2024

VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling

Paper • 2501.00574 • Published Dec 31, 2024 • 6

Make Your Training Flexible: Towards Deployment-Efficient Video Models

Paper • 2503.14237 • Published Mar 18, 2025 • 5

Seed1.5-VL Technical Report

Paper • 2505.07062 • Published May 11, 2025 • 145

Andy1621

authored a paper 6 months ago

Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment

Paper • 2412.19326 • Published Dec 26, 2024 • 18

akhaliq

posted an update 6 months ago

Google drops Gemini 2.0 Flash Thinking

a new experimental model that unlocks stronger reasoning capabilities and shows its thoughts. The model plans (with thoughts visible), can solve complex problems with Flash speeds, and more

now available in anychat, try it out: https://huggingface.co/spaces/akhaliq/anychat

Andy1621

authored a paper 6 months ago

Causal Diffusion Transformers for Generative Modeling

Paper • 2412.12095 • Published Dec 16, 2024 • 23

Team members 2

Sense-X's activity