Enhancing Medical Image Classification with Diffusion-Based Synthetic Data

Samuel Bohumel

Supervisor(s): Ing. Maroš Kollár

Slovak Technical University


Abstract: Neural networks and deep learning are now widely used to solve tasks in many domains, including computer vision. In medical image processing, these approaches can enable fast and efficient diagnosis. However, training such models is hampered by a lack of annotated data, whose collection and especially annotation can be time-consuming and expensive. In this work, we explore the use of generative models for medical data synthesis that could effectively complement existing training sets and improve the performance of classification models. Our research focuses on the synthesis of atypical cells, which are the main indicator of a tumor. Nuclear atypia usually manifests as enlarged cells and irregular shapes, which are the features we focus on. We use diffusion probabilistic models for guided synthesis of samples, conditioned either on a segmentation mask or on an atypia class. This research contributes to the integration of machine learning techniques in healthcare and evaluates the effect of including synthetic data in training sets.
Keywords: Computer Vision, Image Processing
Year: 2025