MUSE: Multi-Scale Dense Self-Distillation for Nucleus Detection and Classification
This is the official Hugging Face repository for MUSE: "Multi-Scale Dense Self-Distillation for Nucleus Detection and Classification".
News
- [2025-12]: Released the pre-trained weights and evaluation code.
- [2025-11]: Accepted to AAAI 2026!
Introduction
In this work, we propose MUSE (MUlti-scale denSE self-distillation), a novel self-supervised learning method tailored for nucleus detection and classification (NDC). At its core is NuLo (Nucleus-based Local self-distillation), a coordinate-guided mechanism that enables flexible local self-distillation based on predicted nucleus positions. By removing the need for strict spatial alignment between augmented views, NuLo allows critical cross-scale alignment, thus unlocking the capacity of models for fine-grained nucleus-level representation. To support MUSE, we design a simple yet effective encoder-decoder architecture and a large field-of-view (LFoV) semi-supervised fine-tuning strategy that together maximize the value of unlabeled pathology images. Extensive experiments on three widely used benchmarks demonstrate that MUSE effectively addresses the core challenges of histopathological NDC. The resulting models not only surpass state-of-the-art supervised baselines but also outperform generic pathology foundation models.
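To make the coordinate-guided idea more concrete, here is a minimal sketch. It is not the MUSE implementation. It assumes each augmented view is a square crop of the source image, described by a hypothetical `Crop` record, and shows how one nucleus position could be located in the feature grids of two views at different scales and read out by bilinear interpolation. The names `Crop`, `to_grid`, and `bilinear_sample` are illustrative and do not come from the repository.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Crop:
    """Hypothetical description of an augmented view: a square crop of the source image."""
    x0: float    # left edge, in source-image pixels
    y0: float    # top edge, in source-image pixels
    size: float  # side length of the crop, in source-image pixels
    grid: int    # side length of the view's feature map, in cells


def to_grid(crop: Crop, x: float, y: float) -> Tuple[float, float]:
    """Map a source-image point to continuous feature-grid coordinates of a view."""
    scale = crop.grid / crop.size
    # Subtract 0.5 so that integer grid coordinates refer to cell centres.
    return (x - crop.x0) * scale - 0.5, (y - crop.y0) * scale - 0.5


def bilinear_sample(fmap: List[List[float]], gx: float, gy: float) -> float:
    """Bilinearly interpolate a single-channel feature map at (gx, gy)."""
    h, w = len(fmap), len(fmap[0])
    # Clamp so that points near the border stay inside the map.
    gx = min(max(gx, 0.0), w - 1.0)
    gy = min(max(gy, 0.0), h - 1.0)
    x_lo, y_lo = int(gx), int(gy)
    x_hi, y_hi = min(x_lo + 1, w - 1), min(y_lo + 1, h - 1)
    tx, ty = gx - x_lo, gy - y_lo
    top = fmap[y_lo][x_lo] * (1 - tx) + fmap[y_lo][x_hi] * tx
    bottom = fmap[y_hi][x_lo] * (1 - tx) + fmap[y_hi][x_hi] * tx
    return top * (1 - ty) + bottom * ty


# One nucleus, two views at different scales: the same point is found in both grids.
nucleus = (100.0, 60.0)
view_a = Crop(x0=0.0, y0=0.0, size=224.0, grid=14)
view_b = Crop(x0=64.0, y0=32.0, size=112.0, grid=14)
grid_a = to_grid(view_a, *nucleus)
grid_b = to_grid(view_b, *nucleus)
```

Because the lookup is driven by the nucleus coordinate rather than by a fixed cell-to-cell correspondence, the two views need not overlap on a common grid, which is the property the paragraph above attributes to NuLo.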
Pre-Trained Models
This repository contains the pre-trained models of MUSE.
| Model | pre-trained weights |
|---|---|
| MUSE (ResNet-50) | r50-224.pth |
| MUSE (ViT-S/16) | vit_s_16-224.pth |
| MUSE (ViT-B/16) | vit_b_16-224.pth |
| LFoV-MUSE (ResNet-50) | r50-512.pth |
| LFoV-MUSE (ViT-S/16) | vit_s_16-512.pth |
| LFoV-MUSE (ViT-B/16) | vit_b_16-512.pth |
Please refer to the GitHub repository for more details.
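As a convenience, the table above can be expressed as a small lookup. This is a sketch only: the helper names `checkpoint_name` and `load_checkpoint` are hypothetical, the backbone keys are taken from the filename prefixes, and the internal key layout of the `.pth` files is not documented here, so inspect the loaded object before passing it to a model.

```python
from pathlib import Path
from typing import Any

# Filenames copied from the table above, keyed by (backbone, large field of view).
CHECKPOINTS = {
    ("r50", False): "r50-224.pth",
    ("vit_s_16", False): "vit_s_16-224.pth",
    ("vit_b_16", False): "vit_b_16-224.pth",
    ("r50", True): "r50-512.pth",
    ("vit_s_16", True): "vit_s_16-512.pth",
    ("vit_b_16", True): "vit_b_16-512.pth",
}


def checkpoint_name(backbone: str, lfov: bool = False) -> str:
    """Return the checkpoint filename for a MUSE (lfov=False) or LFoV-MUSE (lfov=True) variant."""
    try:
        return CHECKPOINTS[(backbone, lfov)]
    except KeyError:
        raise ValueError(f"unknown variant: backbone={backbone!r}, lfov={lfov}") from None


def load_checkpoint(weights_dir: str, backbone: str, lfov: bool = False) -> Any:
    """Load an already-downloaded checkpoint onto the CPU (requires PyTorch)."""
    import torch  # imported lazily so the lookup above works without PyTorch

    path = Path(weights_dir) / checkpoint_name(backbone, lfov)
    # Assumption: the file is a regular torch.save() artifact.
    return torch.load(path, map_location="cpu")
```

For example, `checkpoint_name("vit_b_16", lfov=True)` resolves to `vit_b_16-512.pth`.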
Evaluation
KNN Evaluation
| Method | BRCAM2C (20x) | OCELOT (20x) | PUMA (20x) | BRCAM2C (40x) | OCELOT (40x) | PUMA (40x) |
|---|---|---|---|---|---|---|
| MUSE (ResNet-50) | 88.37 | 85.51 | 81.21 | 85.78 | 83.49 | 78.60 |
| MUSE (ViT-S/16) | 86.88 | 86.13 | 80.00 | 87.67 | 85.45 | 79.71 |
| MUSE (ViT-B/16) | 87.56 | 85.90 | 81.26 | 88.11 | 85.55 | 81.19 |
| LFoV-MUSE (ResNet-50) | 89.53 | 86.21 | 82.21 | 87.44 | 85.18 | 79.88 |
| LFoV-MUSE (ViT-S/16) | 85.47 | 84.17 | 79.15 | 86.00 | 84.95 | 79.82 |
| LFoV-MUSE (ViT-B/16) | 89.03 | 87.38 | 81.11 | 88.93 | 85.52 | 83.16 |
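For readers unfamiliar with the protocol, a KNN evaluation keeps the pre-trained encoder frozen and classifies each test feature by a vote among its nearest training features. The sketch below illustrates that general idea in plain Python. It is not the repository's evaluation code: the choice of cosine similarity, the value of `k`, and the use of plain accuracy are assumptions, and the metric behind the numbers in the table may differ.

```python
import math
from collections import Counter
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train_feats: List[Sequence[float]], train_labels: List[int],
                query: Sequence[float], k: int = 3) -> int:
    """Predict a label by majority vote among the k most similar training features."""
    ranked = sorted(range(len(train_feats)),
                    key=lambda i: cosine(train_feats[i], query), reverse=True)
    # Counter keeps first-seen order for equal counts, so ties favour the closer neighbour.
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def knn_accuracy(train_feats, train_labels, test_feats, test_labels, k: int = 3) -> float:
    """Fraction of test features whose predicted label matches the ground truth."""
    hits = sum(knn_predict(train_feats, train_labels, f, k) == y
               for f, y in zip(test_feats, test_labels))
    return hits / len(test_labels)


# Toy features standing in for frozen-encoder outputs at nucleus positions.
train_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
train_labels = [0, 0, 1, 1]
test_feats = [[0.8, 0.2], [0.2, 0.8]]
test_labels = [0, 1]
score = knn_accuracy(train_feats, train_labels, test_feats, test_labels, k=3)
```

Since no weights are updated, this kind of evaluation probes the quality of the learned representation directly.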
Linear Probing Evaluation
| Method | BRCAM2C (20x) | OCELOT (20x) | PUMA (20x) | BRCAM2C (40x) | OCELOT (40x) | PUMA (40x) |
|---|---|---|---|---|---|---|
| MUSE (ResNet-50) | 88.14 | 85.57 | 81.53 | 87.39 | 83.65 | 80.64 |
| MUSE (ViT-S/16) | 87.79 | 85.42 | 81.34 | 89.66 | 85.20 | 80.17 |
| MUSE (ViT-B/16) | 89.60 | 85.82 | 83.29 | 88.86 | 85.57 | 82.48 |
| LFoV-MUSE (ResNet-50) | 90.18 | 86.19 | 83.85 | 88.86 | 85.78 | 82.76 |
| LFoV-MUSE (ViT-S/16) | 87.06 | 87.21 | 84.22 | 86.63 | 86.57 | 83.53 |
| LFoV-MUSE (ViT-B/16) | 89.20 | 86.10 | 84.36 | 90.18 | 86.43 | 85.12 |
Fine-Tuning Evaluation
| Method | BRCAM2C (20x) | OCELOT (20x) | PUMA (20x) | BRCAM2C (40x) | OCELOT (40x) | PUMA (40x) |
|---|---|---|---|---|---|---|
| MUSE (ResNet-50) | 86.29 | 86.30 | 81.69 | 88.26 | 84.85 | 80.42 |
| MUSE (ViT-S/16) | 86.40 | 86.21 | 83.09 | 88.56 | 86.40 | 80.79 |
| MUSE (ViT-B/16) | 88.43 | 86.03 | 84.18 | 89.60 | 86.87 | 82.46 |
| LFoV-MUSE (ResNet-50) | 88.70 | 87.87 | 84.49 | 89.74 | 85.17 | 82.62 |
| LFoV-MUSE (ViT-S/16) | 86.29 | 87.54 | 83.81 | 86.59 | 88.01 | 84.56 |
| LFoV-MUSE (ViT-B/16) | 89.29 | 87.05 | 84.84 | 90.26 | 87.87 | 85.74 |
License
This repository is released under the Apache 2.0 license.
Citation
If you find the code and pre-trained models useful for your research, please consider citing our paper.
@article{yang2025muse,
title={MUSE: Multi-Scale Dense Self-Distillation for Nucleus Detection and Classification},
author={Yang, Zijiang and Chao, Hanqing and Zhao, Bokai and Yang, Yelin and Zhang, Yunshuo and Fu, Dongmei and Zhang, Junping and Lu, Le and Yan, Ke and Jin, Dakai and others},
journal={arXiv preprint arXiv:2511.05170},
year={2025}
}