Molecule Detection YOLO in MolParser

From the paper: "MolParser: End-to-end Visual Recognition of Molecule Structures in the Wild" (ICCV 2025, under review)

We provide several ultralytics YOLO11 weights for molecule detection, in different model sizes and input resolutions.

General molecule structure detection models

moldet_yolo11[size]_640_general.pt

YOLO11 weights trained on 35k human-annotated image crops and 100k generated images

640x640 input resolution
supports handwritten molecules
multiscale input (inputs can be crops of one or more molecules, crops of reactions or tables, or single-page PDF images)

Warning: For single-molecule input (where the detector is effectively used as a classifier), adding padding around the molecule can improve performance.
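
One way to apply such padding is sketched below. The square white canvas, the 25% margin, and the helper names `padded_layout` and `pad_molecule_image` are illustrative assumptions, not settings from the authors; the margin likely needs tuning on your own data.

```python
def padded_layout(width, height, pad_ratio=0.25):
    """Return (side, x_off, y_off) for a square canvas with the image centred.

    pad_ratio is a hypothetical default: the margin added on each side,
    as a fraction of the longer image edge.
    """
    longest = max(width, height)
    pad = int(round(longest * pad_ratio))
    side = longest + 2 * pad
    x_off = (side - width) // 2
    y_off = (side - height) // 2
    return side, x_off, y_off


def pad_molecule_image(path, pad_ratio=0.25, fill=(255, 255, 255)):
    """Paste a single-molecule crop onto a larger white square canvas."""
    from PIL import Image  # Pillow is installed as an ultralytics dependency

    img = Image.open(path).convert("RGB")
    side, x_off, y_off = padded_layout(img.width, img.height, pad_ratio)
    canvas = Image.new("RGB", (side, side), fill)  # assumed white background
    canvas.paste(img, (x_off, y_off))
    return canvas
```

The returned PIL image can be passed directly as the source of `model.predict`, for example `model.predict(pad_molecule_image("path/to/molecule.png"), imgsz=640, conf=0.5)`.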

Results from private testing:

Model Size	mAP50	mAP50-95	Speed (T4 TensorRT10)
n	0.9581	0.8524	1.5 ± 0.0 ms
s	0.9652	0.8704	2.5 ± 0.1 ms
m	0.9686	0.8736	4.7 ± 0.1 ms
l	0.9891	0.9028	6.2 ± 0.1 ms

Usage:

from ultralytics import YOLO
model = YOLO("moldet_yolo11l_640_general.pt")
model.predict("path/to/image.png", save=True, imgsz=640, conf=0.5)
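
To work with the detections programmatically instead of saving annotated images, the boxes can be read from the returned results. The sketch below is a minimal example; the helper names `clamp_box` and `detect_molecules` are hypothetical, and rounding the boxes outward to integer pixels is an assumption made here so they can be used for cropping.

```python
import math


def clamp_box(box, width, height):
    """Round a float (x1, y1, x2, y2) box outward and clip it to the image."""
    x1, y1, x2, y2 = box
    return (
        max(0, math.floor(x1)),
        max(0, math.floor(y1)),
        min(width, math.ceil(x2)),
        min(height, math.ceil(y2)),
    )


def detect_molecules(model, image_path, imgsz=640, conf=0.5):
    """Return a list of ((x1, y1, x2, y2), score) pairs for one image."""
    result = model.predict(image_path, imgsz=imgsz, conf=conf)[0]
    height, width = result.orig_shape  # original image size as (h, w)
    boxes = result.boxes.xyxy.tolist()  # pixel coordinates in the original image
    scores = result.boxes.conf.tolist()
    return [(clamp_box(b, width, height), s) for b, s in zip(boxes, scores)]
```

With the model loaded as above, `detect_molecules(model, "path/to/image.png")` returns one entry per detected molecule.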

moldet_yolo11[size]_960_doc.pt

YOLO11 weights trained on 26k human-annotated PDF pages (patents, papers, and books)

Warning: We recommend rendering PDF pages with MuPDF at more than 144 dpi.
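
A minimal sketch of such rendering with PyMuPDF (the Python bindings for MuPDF) is shown below. The 200 dpi default, the output file pattern, and the helper names `dpi_to_zoom` and `render_pdf_pages` are assumptions chosen for illustration; the source only states that the resolution should exceed 144 dpi.

```python
def dpi_to_zoom(dpi):
    """PDF user space is 72 units per inch, so the zoom factor is dpi / 72."""
    return dpi / 72.0


def render_pdf_pages(pdf_path, out_pattern="page_{:04d}.png", dpi=200):
    """Render every page of a PDF to a PNG file and return the file paths."""
    import fitz  # PyMuPDF; recent releases can also be imported as `pymupdf`

    zoom = dpi_to_zoom(dpi)
    paths = []
    with fitz.open(pdf_path) as doc:
        for index, page in enumerate(doc):
            # Scale the page uniformly; alpha=False gives an opaque RGB image.
            pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom), alpha=False)
            out_path = out_pattern.format(index)
            pix.save(out_path)
            paths.append(out_path)
    return paths
```

Each rendered PNG can then be passed to `model.predict` with `imgsz=960`.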

Results from private testing:

Model Size	mAP50	mAP50-95	Speed (T4 TensorRT10)
n	0.9871	0.8732	3.1 ± 0.0 ms
s	0.9851	0.8824	5.5 ± 0.1 ms
m	0.9867	0.8917	9.9 ± 0.2 ms
l	0.9913	0.9011	13.1 ± 0.3 ms

Usage:

from ultralytics import YOLO
model = YOLO("moldet_yolo11l_960_doc.pt")
model.predict("path/to/pdf_page_image.png", save=True, imgsz=960, conf=0.5)