Data
Imageomics Datasets
Note: Kenyan Animal Behavior Recognition Dataset: Annotated drone videos of giraffes, plains zebras, and Grevy's zebras at the Mpala Research Centre.
imageomics/fish-vista
Note: Fish images for species classification, trait identification, and trait segmentation. An imbalanced dataset designed for biological and evolutionary trait discovery, model interpretability, and weakly supervised semantic segmentation.
imageomics/VLM4Bio
imageomics/Heliconius-Collection_Cambridge-Butterfly
Note: Approximately 36K RGB images of 12K butterfly specimens collected by Chris Jiggins' research group at the University of Cambridge. Both dorsal and ventral images are available. The collection contains primarily separated wings, with some whole-butterfly images. Image content varies (white standard, background color, etc.) based on the needs of the project. We added image-level Heliconius subspecies mimic group information to the entries.
imageomics/2018-NEON-beetles
Note: Collection of 577 images of ethanol-preserved ground beetles (family Carabidae) collected from various NEON sites in 2018 and photographed in batches in 2022. Each image contains a collection of beetles of the same species from a single plot at the labeled site, arranged on a lattice and photographed; the elytra length and width were then annotated for each individual in each image using Zooniverse.
imageomics/TreeOfLife-200M
Note: Nearly 214 million images representing 952,257 taxa across the tree of life, used to train BioCLIP 2. This dataset combines images and metadata from four core biodiversity data providers: Global Biodiversity Information Facility (GBIF), Encyclopedia of Life (EOL), BIOSCAN-5M, and FathomNet, to more than double the number of unique taxa covered by TreeOfLife-10M.
imageomics/TreeOfLife-10M
Note: Over 10 million images covering 454 thousand taxa in the tree of life, used to train BioCLIP. This dataset of images of biological organisms paired with their associated taxonomic labels expands on the foundation established by existing high-quality datasets, such as iNat21 and BIOSCAN-1M, by further incorporating newly curated images from the Encyclopedia of Life (eol.org), which supplies most of TreeOfLife-10M’s data diversity.
imageomics/rare-species
Note: This dataset was generated alongside TreeOfLife-10M as a benchmark for BioCLIP; data (images and text) were pulled from the Encyclopedia of Life (EOL) to build a dataset of rare species for zero-shot classification and more refined image classification tasks. Here, we use "rare species" to mean species listed on the International Union for Conservation of Nature (IUCN) Red List as Near Threatened, Vulnerable, Endangered, Critically Endangered, or Extinct in the Wild.
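The "rare species" definition above is a membership test over five IUCN Red List categories, which can be sketched in a few lines of Python. This is only an illustration of the definition, not the dataset authors' code; the function name `is_rare` and the idea of passing the category as a plain string are assumptions made for the example.

```python
# Minimal sketch of the "rare species" definition quoted in the note.
# The helper name and the string-based input format are hypothetical.

RARE_IUCN_CATEGORIES = (
    "Near Threatened",
    "Vulnerable",
    "Endangered",
    "Critically Endangered",
    "Extinct in the Wild",
)

# Case-insensitive lookup set built once from the list above.
_RARE_LOOKUP = frozenset(c.casefold() for c in RARE_IUCN_CATEGORIES)


def is_rare(iucn_category: str) -> bool:
    """Return True if the Red List category counts as "rare" per the note."""
    return iucn_category.strip().casefold() in _RARE_LOOKUP


if __name__ == "__main__":
    for category in ("Vulnerable", "Least Concern", "Extinct in the Wild"):
        print(category, "->", is_rare(category))
```

Categories outside the five listed, such as Least Concern or Extinct, fall outside the definition and return `False`.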
imageomics/KABR-telemetry
Note: Drone telemetry data associated with the KABR dataset (annotated video behavior of zebras and giraffes at the Mpala Research Centre). This telemetry dataset contains information about the status of the drone during the missions, including location and altitude, along with the bounding box dimensions of the wildlife in the frame and behavior annotation information.
imageomics/Curated_GoldStandard_Hoyal_Cuthill
Note: Dorsal full-body images of subspecies of Heliconius erato and Heliconius melpomene (18 subspecies total). There are 960 images of 320 specimens (3 images of each specimen: Original / Bird transformed / Butterfly transformed). Original images are low-resolution RGB photographs (Hoyal Cuthill et al., 2019); the Bird transformed and Butterfly transformed images were created using AcuityView, with estimates of acuity from AcuityView 2.0 and Land (1997).
imageomics/questFish2024
Note: Images of fish collected from bodies of water near Princeton University for QUEST 2024.
imageomics/IDLE-OO-Camera-Traps
Note: IDLE-OO Camera Traps is a 5-dataset benchmark of camera trap images from the Labeled Information Library of Alexandria: Biology and Conservation (LILA BC) with a total of 2,586 images for species classification. Each of the 5 benchmarks is balanced to have the same number of images for each species within it (between 310 and 1120 images), representing between 16 and 39 species.
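To make "balanced to have the same number of images for each species" concrete, here is one common way to build a class-balanced subset by random subsampling. It is a sketch under stated assumptions, not the procedure used to construct IDLE-OO Camera Traps: the function name `balance_by_label`, the `(image_id, label)` input format, and the choice to drop species with too few images are all hypothetical.

```python
import random
from collections import defaultdict


def balance_by_label(items, per_class, seed=0):
    """Subsample so every kept label has exactly `per_class` items.

    `items` is an iterable of (image_id, label) pairs (hypothetical format).
    Labels with fewer than `per_class` items are dropped in this sketch.
    """
    by_label = defaultdict(list)
    for image_id, label in items:
        by_label[label].append(image_id)

    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    balanced = {}
    for label in sorted(by_label):
        ids = by_label[label]
        if len(ids) >= per_class:
            balanced[label] = sorted(rng.sample(ids, per_class))
    return balanced


if __name__ == "__main__":
    # Placeholder IDs and labels, not real camera trap data.
    demo = [(f"a{i}", "species_a") for i in range(5)]
    demo += [(f"b{i}", "species_b") for i in range(3)]
    demo += [("c0", "species_c")]
    for label, ids in balance_by_label(demo, per_class=3).items():
        print(label, ids)
```

Sampling without replacement and fixing the seed means the same subset is produced on every run, which matters when a benchmark is meant to be compared across models.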
imageomics/char-sim-data
imageomics/plazi-biospecimen-img-descriptions
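The repository IDs listed above follow the Hugging Face Hub `owner/name` pattern, so they can be passed to the `datasets` library's `load_dataset` function. The sketch below is a hedged starting point rather than documented usage for any of these datasets: the helper names are hypothetical, and whether a given repository loads with the default configuration or exposes a `train` split is an assumption that should be checked against each dataset card.

```python
# A few repository IDs copied from the listing above.
REPO_IDS = [
    "imageomics/fish-vista",
    "imageomics/VLM4Bio",
    "imageomics/TreeOfLife-10M",
    "imageomics/rare-species",
    "imageomics/KABR-telemetry",
]


def dataset_url(repo_id: str) -> str:
    """Return the Hub page URL for a dataset repository ID (owner/name)."""
    owner, sep, name = repo_id.partition("/")
    if not (owner and sep and name):
        raise ValueError(f"expected 'owner/name', got {repo_id!r}")
    return f"https://huggingface.co/datasets/{repo_id}"


def load_split(repo_id: str, split: str = "train", streaming: bool = True):
    """Load one split with the `datasets` library.

    Assumptions: the repository loads with its default configuration and
    has a split named `split`. Streaming avoids downloading everything
    up front, which matters for the larger collections.
    """
    from datasets import load_dataset  # imported lazily; needs `pip install datasets`

    return load_dataset(repo_id, split=split, streaming=streaming)


if __name__ == "__main__":
    for repo_id in REPO_IDS:
        print(dataset_url(repo_id))
```

Calling `load_split("imageomics/rare-species")` would then return an iterable dataset if those assumptions hold; some repositories may require a configuration name or a different split.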