Self-validated U-GAN for Target Class Transformation in Segmentation Tasks
DOI:
https://doi.org/10.31649/1997-9266-2024-174-3-102-111
Keywords:
augmentation, data generation, generative adversarial network, GAN, segmentation, deep learning, U-GAN, U-generator
Abstract
The paper addresses the scarcity of data for training automated intelligent systems in specialized fields such as medicine, satellite image analysis, agriculture, ecology, and language processing. It reviews modern methods for mitigating this problem, including augmentation, generative adversarial networks, diffusion models, and inpainting. The focus is on segmentation, where a mask must be created for each new object in addition to the image itself. The paper also notes that selecting the best epoch during model training is a subjective, manual process, and outlines metrics that can help automate it, such as the Inception Score and the Fréchet Inception Distance.
An improved model for partial transformation of the target segmentation class is proposed. It adds self-validation components: an additional loss function that controls the similarity of the output image to the input image, a pretrained segmentation model, and a metric that assesses the quality of the generated masks by comparing them with the segmentation masks of the generated images. These improvements allow the system to transform the background (zero) class into the target class more effectively, to create more accurate segmentation masks, and to select the best epoch automatically during training.
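The self-validation idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the L1 form of the similarity loss, and the use of the mean Jaccard coefficient as the selection metric are all assumptions made for illustration. The sketch compares each generated mask with the mask that a pretrained segmentation model predicts for the corresponding generated image, and picks the epoch with the highest agreement.

```python
import numpy as np


def jaccard(mask_a, mask_b):
    """Jaccard coefficient (intersection over union) of two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return float(np.logical_and(a, b).sum() / union)


def similarity_loss(input_image, output_image):
    """Mean absolute difference between input and output images.

    Assumption: the abstract only says the loss controls similarity of the
    output to the input; the L1 form is chosen here for illustration.
    """
    x = np.asarray(input_image, dtype=float)
    y = np.asarray(output_image, dtype=float)
    return float(np.mean(np.abs(x - y)))


def self_validation_score(generated_masks, predicted_masks):
    """Mean Jaccard between generator masks and segmentation-model masks."""
    pairs = zip(generated_masks, predicted_masks)
    return float(np.mean([jaccard(g, p) for g, p in pairs]))


def select_best_epoch(scores_by_epoch):
    """Return the epoch whose self-validation score is highest."""
    return max(scores_by_epoch, key=scores_by_epoch.get)
```

In a training loop, `self_validation_score` would be computed once per epoch on a batch of generated samples and stored in a dictionary keyed by epoch number; `select_best_epoch` then replaces the manual, expert-driven choice of checkpoint.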
Experiments on a dataset of panoramic dental images showed that the proposed approach improved the accuracy of filling segmentation by 0.9 percentage points, raising the Jaccard coefficient from 90.5% to 91.4%. The generative adversarial network was trained for 150 epochs with automatic selection of the best epoch, which was the 135th, and the quality of the images generated at that epoch was confirmed by expert evaluation. On satellite images of ships, the model improved segmentation accuracy from 63.4% to 65.2%. Despite the complexity of the data, the model adequately transformed input images of empty sea into images containing ships. The best results were achieved at the 82nd epoch, which coincided with the expert's choice of the best epoch. This agreement shows that automatic selection of the optimal epoch can remove a subjective factor from training and speed up model preparation.
These results confirm the effectiveness of the proposed approach: it improves segmentation metrics and automates more of the training process than the baseline approach. The proposed methods are potentially applicable in a wide range of fields, where they can support the development of new intelligent systems and increase their accuracy.
License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).