Describe Anything Model

AI & ML interests

None defined yet.

DescribeAnythingModel's activity

richardaecn

authored 13 papers about 16 hours ago

Unified Visual Relationship Detection with Vision and Language Models

Paper • 2303.08998 • Published Mar 16, 2023

The iNaturalist Species Classification and Detection Dataset

Paper • 1707.06642 • Published Jul 20, 2017

Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception

Paper • 2305.06324 • Published May 10, 2023 • 1

Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset

Paper • 2004.12276 • Published Apr 26, 2020 • 1

Spatiotemporal Contrastive Video Representation Learning

Paper • 2008.03800 • Published Aug 9, 2020

Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation

Paper • 2012.07177 • Published Dec 13, 2020

A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models

Paper • 2302.06235 • Published Feb 13, 2023

Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

Paper • 2411.07126 • Published Nov 11, 2024 • 30

Edify 3D: Scalable High-Quality 3D Asset Generation

Paper • 2411.07135 • Published Nov 11, 2024

Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning

Paper • 2503.15558 • Published 6 days ago • 35

longlian

authored a paper 5 days ago

Atlas: Multi-Scale Attention Improves Long Context Image Modeling

Paper • 2503.12355 • Published 9 days ago • 11

richardaecn

authored a paper 8 months ago

Wolf: Captioning Everything with a World Summarization Framework

Paper • 2407.18908 • Published Jul 26, 2024 • 32

richardaecn

authored a paper 11 months ago

Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation

Paper • 2404.19752 • Published Apr 30, 2024 • 24

longlian

authored a paper about 1 year ago

Rethinking Patch Dependence for Masked Autoencoders

Paper • 2401.14391 • Published Jan 25, 2024 • 26

longlian

authored 2 papers over 1 year ago

LLM-grounded Video Diffusion Models

Paper • 2309.17444 • Published Sep 29, 2023 • 2

Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping

Paper • 2304.08025 • Published Apr 17, 2023 • 2

longlian

authored a paper almost 2 years ago

LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models

Paper • 2305.13655 • Published May 23, 2023 • 7

Team members 2
