Jiannan Huang

Rbrq · http://rbrq03.github.io

AI & ML interests

None yet

Organizations

None yet

liked a dataset about 2 months ago

yandex/alchemist

Viewer • Updated Jun 6 • 3.35k • 642 • 41

updated a dataset 2 months ago

Rbrq/odd

Updated May 5 • 132

published a dataset 3 months ago

Rbrq/odd

Updated May 5 • 132

liked a dataset 4 months ago

CaptionEmporium/conceptual-captions-cc12m-llavanext

Viewer • Updated Jun 30, 2024 • 11M • 265 • 20

upvoted a paper 4 months ago

Video-T1: Test-Time Scaling for Video Generation

Paper • 2503.18942 • Published Mar 24 • 89

liked 2 datasets 4 months ago

sayakpaul/coco-30-val-2014

Viewer • Updated Feb 5, 2024 • 30k • 455 • 11

jackyhate/text-to-image-2M

Viewer • Updated Sep 22, 2024 • 649k • 2.84k • 116

liked a model 5 months ago

AiArtLab/waifu-2b

Text-to-Image • Updated Jan 30 • 11 • 7

liked a model 6 months ago

black-forest-labs/FLUX.1-dev

Text-to-Image • Updated 20 days ago • 1.44M • 10.9k

liked a Space 8 months ago

Image to Drawing

Convert images to drawings with complex lines

liked a model 10 months ago

Qwen/Qwen2-VL-2B-Instruct

Image-Text-to-Text • 2B • Updated Jan 12 • 1.04M • 433

liked a Space 10 months ago

Open VLM Leaderboard

VLMEvalKit Evaluation Results Collection

liked a model 10 months ago

allenai/Molmo-7B-D-0924

Image-Text-to-Text • 8B • Updated Apr 4 • 166k • 536

liked a dataset 11 months ago

THUDM/ImageRewardDB

Updated Jun 21, 2023 • 389 • 41

upvoted a paper 11 months ago

Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

Paper • 2408.15998 • Published Aug 28, 2024 • 88

upvoted an article 11 months ago

MobileNet Baselines

Article • Jul 26, 2024 • 24

liked a model 11 months ago

CiaraRowles/IP-Adapter-Instruct

Image-to-Image • Updated Aug 13, 2024 • 110 • 51

updated a model about 1 year ago

Rbrq/detr-finetuned-cppe-5-10k-steps

Updated Jul 14, 2024

reacted to merve's post with 🔥 about 1 year ago

Post

5175

Real-time DEtection Transformer (RT-DETR) landed in transformers 🤩 with Apache 2.0 license 😍

🔖 models: PekingU
🔖 demo: merve/RT-DETR-tracking-coco
📝 paper: DETRs Beat YOLOs on Real-time Object Detection (2304.08069)
📖 notebook: https://github.com/merveenoyan/example_notebooks/blob/main/RT_DETR_Notebook.ipynb

YOLO models are known to be super fast for real-time computer vision, but they have a downside: their results are sensitive to NMS 🥲

Transformer-based models, on the other hand, are not as computationally efficient 🥲

Isn't there something in between? Enter RT-DETR!

The authors combined a CNN backbone and a multi-stage hybrid encoder (combining convs and attn) with a transformer decoder. In the paper, the authors also claim one can adjust speed by changing the number of decoder layers, without retraining.
The authors find that the model outperforms the previous state of the art in both speed and accuracy. 🤩
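
For readers unfamiliar with the NMS step mentioned in the post, here is a minimal sketch of greedy non-maximum suppression in plain Python. It is an illustration only, not code from the RT-DETR paper or from any YOLO implementation; the function names `iou` and `nms` and the toy boxes are assumptions made for this example. It shows how the set of surviving detections can change with the IoU threshold, which is the kind of sensitivity the post refers to.

```python
from typing import List, Tuple

# A box is (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed format).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if
    it does not overlap an already-kept box by more than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]

print(nms(boxes, scores, iou_threshold=0.5))  # overlapping pair collapses to one box
print(nms(boxes, scores, iou_threshold=0.7))  # looser threshold keeps all three
```

The two overlapping boxes have an IoU of about 0.68, so a threshold of 0.5 suppresses the lower-scoring one while 0.7 keeps both. DETR-style detectors predict a fixed set of boxes directly and do not need this post-processing step.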

updated a model about 1 year ago

Rbrq/detr_finetuned_cppe5

Object Detection • 0.0B • Updated Jul 8, 2024 • 5