Medical SAM3: A Foundation Model for Universal Prompt-Driven Medical Image Segmentation
Abstract
Medical SAM3 adapts the SAM3 foundation model through comprehensive fine-tuning on diverse medical imaging datasets to achieve robust prompt-driven segmentation across various modalities and anatomical structures.
Promptable segmentation foundation models such as SAM3 have demonstrated strong generalization capabilities through interactive and concept-based prompting. However, their direct applicability to medical image segmentation remains limited by severe domain shifts, the absence of privileged spatial prompts, and the need to reason over complex anatomical and volumetric structures. Here we present Medical SAM3, a foundation model for universal prompt-driven medical image segmentation, obtained by fully fine-tuning SAM3 on large-scale, heterogeneous 2D and 3D medical imaging datasets with paired segmentation masks and text prompts. Through a systematic analysis of vanilla SAM3, we observe that its performance degrades substantially on medical data, with its apparent competitiveness largely relying on strong geometric priors such as ground-truth-derived bounding boxes. These findings motivate full model adaptation beyond prompt engineering alone. By fine-tuning SAM3's model parameters on 33 datasets spanning 10 medical imaging modalities, Medical SAM3 acquires robust domain-specific representations while preserving prompt-driven flexibility. Extensive experiments across organs, imaging modalities, and dimensionalities demonstrate consistent and significant performance gains, particularly in challenging scenarios characterized by semantic ambiguity, complex morphology, and long-range 3D context. Our results establish Medical SAM3 as a universal, text-guided segmentation foundation model for medical imaging and highlight the importance of holistic model adaptation for achieving robust prompt-driven segmentation under severe domain shift. Code and model will be made available at https://github.com/AIM-Research-Lab/Medical-SAM3.
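To make the abstract's recipe concrete, here is a minimal sketch of what full-parameter fine-tuning on paired (image, mask, text-prompt) data could look like in PyTorch. This is illustrative only: the `medical_sam3` package, `MedicalSAM3`, and `MedicalPromptDataset` are hypothetical names, and the Dice-plus-cross-entropy objective is a common segmentation loss assumed here, not confirmed by the paper or repository.

```python
# Minimal sketch of full-parameter fine-tuning on (image, mask, text prompt) triplets.
# `medical_sam3`, `MedicalSAM3`, and `MedicalPromptDataset` are hypothetical names;
# the Dice + BCE objective is a common choice assumed here, not taken from the paper.
import torch
import torch.nn.functional as F

from medical_sam3 import MedicalSAM3, MedicalPromptDataset  # hypothetical package

def dice_loss(logits: torch.Tensor, target: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Soft Dice loss over a batch of predicted mask logits and binary targets."""
    prob = logits.sigmoid().flatten(1)
    target = target.flatten(1)
    inter = (prob * target).sum(1)
    union = prob.sum(1) + target.sum(1)
    return 1.0 - ((2.0 * inter + eps) / (union + eps)).mean()

model = MedicalSAM3.from_pretrained("sam3-base")  # start from vanilla SAM3 weights
model.train()                                     # full fine-tuning: nothing frozen
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5, weight_decay=0.01)

# Paired samples drawn from the heterogeneous 33-dataset training corpus.
loader = torch.utils.data.DataLoader(MedicalPromptDataset(split="train"), batch_size=8)

for images, masks, text_prompts in loader:
    masks = masks.float()
    logits = model(images, text=text_prompts)     # text-conditioned mask logits
    loss = dice_loss(logits, masks) + F.binary_cross_entropy_with_logits(logits, masks)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The key design choice the abstract emphasizes is that all of SAM3's parameters are updated, rather than only prompts or adapters, which the authors argue is necessary under severe domain shift.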
Community
Medical SAM3: Bridging the Gap in Text-Guided Medical Image Segmentation
Existing "segment anything" foundation models often struggle on medical imaging, particularly in the absence of spatial prompts such as bounding boxes. Medical SAM3 addresses this by strengthening the model's semantic understanding through full-parameter fine-tuning.
Key Contributions:
- Reduced Reliance on Spatial Cues: The model is trained to segment from text prompts alone (e.g., "Polyp", "Tumor"), enabling a more automated workflow (see the sketch after this list).
- Improved Generalization: Experiments on 7 unseen external datasets show a substantial zero-shot improvement, raising the Dice score from 11.9% to 73.9%.
- Diverse Training Data: Trained on a corpus of 33 datasets across 10 imaging modalities to capture a wide range of medical semantics.
We hope this work contributes to the development of more robust, prompt-driven medical AI assistants.
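As a concrete illustration of the text-only workflow described in the contributions above, a hypothetical inference call might look like the following; the checkpoint name and `segment` method are illustrative placeholders, not the project's published interface.

```python
# Illustrative text-only inference (hypothetical API; see the GitHub repo for the real one).
from medical_sam3 import MedicalSAM3  # hypothetical package name

model = MedicalSAM3.from_pretrained("medical-sam3").eval()

# No bounding box or click prompt is supplied: the text prompt alone drives segmentation.
mask = model.segment("ct_abdomen.nii.gz", text_prompt="liver tumor")
mask.save("liver_tumor_mask.nii.gz")
```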
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- MedSAM3: Delving into Segment Anything with Medical Concepts (2025)
- Boundary-Aware Test-Time Adaptation for Zero-Shot Medical Image Segmentation (2025)
- SwinTF3D: A Lightweight Multimodal Fusion Approach for Text-Guided 3D Medical Image Segmentation (2025)
- MedVL-SAM2: A unified 3D medical vision-language model for multimodal reasoning and prompt-driven segmentation (2026)
- Comparing SAM 2 and SAM 3 for Zero-Shot Segmentation of 3D Medical Data (2025)
- Vision-Language Enhanced Foundation Model for Semi-supervised Medical Image Segmentation (2025)
- PPBoost: Progressive Prompt Boosting for Text-Driven Medical Image Segmentation (2025)