TY - GEN
T1 - BioVFM-21M
T2 - 3rd International Workshop on Foundation Models for Medical Artificial General Intelligence, MedAGI 2025, Held in Conjunction with the 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025
AU - Liu, Jiarun
AU - Zhou, Hong Yu
AU - Huang, Weijian
AU - Yang, Hao
AU - Song, Dongning
AU - Tan, Tao
AU - Liang, Yong
AU - Wang, Shanshan
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026
Y1 - 2026
N2 - Scaling up model and data size has demonstrated impressive improvements across a wide range of tasks. Despite extensive studies of scaling behavior for general-purpose tasks, medical images differ substantially from natural data, and the key factors in developing medical vision foundation models at scale remain unclear. In this paper, we explore the scaling behavior across model sizes, training algorithms, data sizes, and imaging modalities in developing scalable medical vision foundation models via self-supervised learning. To support scalable pretraining, we introduce BioVFM-21M, a large-scale biomedical image dataset encompassing a wide range of biomedical imaging modalities and anatomies. We observe that scaling up does provide benefits, but they vary across tasks. Further analysis reveals several factors correlated with scaling benefits. Finally, we propose BioVFM, a large-scale medical vision foundation model pretrained on 21 million biomedical images, which outperforms previous state-of-the-art foundation models across 12 medical benchmarks. Our results highlight that while scaling up is beneficial for pursuing better performance, task characteristics, data diversity, pretraining methods, and computational efficiency remain critical considerations for developing scalable medical foundation models. We will release the dataset, model, and algorithms of this study on GitHub.
AB - Scaling up model and data size has demonstrated impressive improvements across a wide range of tasks. Despite extensive studies of scaling behavior for general-purpose tasks, medical images differ substantially from natural data, and the key factors in developing medical vision foundation models at scale remain unclear. In this paper, we explore the scaling behavior across model sizes, training algorithms, data sizes, and imaging modalities in developing scalable medical vision foundation models via self-supervised learning. To support scalable pretraining, we introduce BioVFM-21M, a large-scale biomedical image dataset encompassing a wide range of biomedical imaging modalities and anatomies. We observe that scaling up does provide benefits, but they vary across tasks. Further analysis reveals several factors correlated with scaling benefits. Finally, we propose BioVFM, a large-scale medical vision foundation model pretrained on 21 million biomedical images, which outperforms previous state-of-the-art foundation models across 12 medical benchmarks. Our results highlight that while scaling up is beneficial for pursuing better performance, task characteristics, data diversity, pretraining methods, and computational efficiency remain critical considerations for developing scalable medical foundation models. We will release the dataset, model, and algorithms of this study on GitHub.
KW - Benchmarking
KW - Correlation analysis
KW - Foundation models
KW - Large medical dataset
KW - Scaling law
KW - Self-supervised learning
UR - https://www.scopus.com/pages/publications/105020018748
U2 - 10.1007/978-3-032-07845-2_3
DO - 10.1007/978-3-032-07845-2_3
M3 - Conference contribution
AN - SCOPUS:105020018748
SN - 9783032078445
T3 - Lecture Notes in Computer Science
SP - 23
EP - 33
BT - Foundation Models for General Medical AI - 3rd International Workshop, MedAGI 2025, Held in Conjunction with MICCAI 2025, Proceedings
A2 - Jeong, Won-Ki
A2 - Kim, Hyunwoo J.
A2 - Deng, Zhongying
A2 - Shen, Yiqing
A2 - Aviles-Rivero, Angelica I
A2 - Zhang, Shaoting
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 27 September 2025 through 27 September 2025
ER -