Detect and annotate poses in images and videos
Ask questions about images and get detailed answers
VLMEvalKit Evaluation Results Collection
Segment images using text prompts, points, or everything mode