Semi-Supervised Breast Ultrasound Segmentation via Text-Guided Foundation Model

Šimon Freivald

Supervisor(s): Ing. Igor Jánoš, PhD

Slovak Technical University


Abstract: Breast cancer remains the leading cause of cancer-related mortality among women worldwide. Ultrasound imaging has become a preferred modality for breast screening because it is non-invasive, free of ionizing radiation, and inexpensive. However, accurately segmenting breast lesions in ultrasound images is still difficult, and the scarcity of pixel-level annotated data in the medical domain limits the training of deep learning models. In this paper, we propose a transformer-based framework for breast tumor segmentation that combines the Universal Ultrasound Foundation Model (USFM) for visual feature extraction with a CLIP text encoder that provides semantic guidance via cross-attention. To address data scarcity, we design a teacher-student training scheme that uses both strongly annotated data with pixel-precise masks and a substantially larger set of weakly annotated data with only bounding box labels. We aggregate a training set of strongly and weakly annotated ultrasound images from multiple public datasets, enriched by data from multiple organs. The proposed method is evaluated on the widely used BUSI and BUS UC benchmark datasets.
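As an illustration of how bounding box labels could supervise a student in a teacher-student scheme, the sketch below binarizes a teacher probability map and keeps only the pixels inside the annotated box. The abstract does not say how the boxes are used, so this is an assumption and not the method of the paper; the function name `box_constrained_pseudo_mask`, the inclusive box convention, and the 0.5 threshold are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical box convention: (row_min, col_min, row_max, col_max), inclusive.
Box = Tuple[int, int, int, int]


def box_constrained_pseudo_mask(
    teacher_probs: List[List[float]],
    box: Box,
    threshold: float = 0.5,  # assumed value, not taken from the paper
) -> List[List[int]]:
    """Binarize teacher probabilities and zero out every pixel outside the box."""
    r0, c0, r1, c1 = box
    mask = []
    for r, row in enumerate(teacher_probs):
        mask.append([
            1 if (r0 <= r <= r1 and c0 <= c <= c1 and p >= threshold) else 0
            for c, p in enumerate(row)
        ])
    return mask


# Toy 4x4 teacher output with a confident response outside the box
# (top-left and right edge) that the box constraint suppresses.
teacher_probs = [
    [0.9, 0.2, 0.1, 0.0],
    [0.1, 0.8, 0.7, 0.1],
    [0.0, 0.6, 0.4, 0.9],
    [0.0, 0.1, 0.2, 0.1],
]
pseudo_mask = box_constrained_pseudo_mask(teacher_probs, box=(1, 1, 2, 2))
```

Under this assumption, the resulting mask would serve as the training target for the student on weakly annotated images, while strongly annotated images would keep their pixel-precise masks.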
Keywords: Computer Vision
Year: 2026