Liver Segmentation Using Modified CAG-SwinUNet with Explainability
DOI: 10.18287/JBPE26.12.010303
Abstract
Accurate liver segmentation from computed tomography (CT) images is crucial for clinical applications such as tumor detection and surgical planning, but it remains challenging due to anatomical complexity and imaging variability. Existing deep learning models struggle with ambiguous liver boundaries, noise sensitivity, and weak feature integration across scales, leading to segmentation errors. This study introduces the Cross-Attention Gate-Shifted Window U-Net (CAG-SwinUNet), an enhanced Swin-UNet variant that incorporates a Cross-Attention Gate (CAG) in the skip connections to selectively refine feature fusion. Unlike traditional concatenation, the CAG dynamically enhances encoder features based on decoder context, integrating residual connections and an output projection to balance local and global information. Extensive evaluation on the Liver Tumor Segmentation (LiTS) and Segmentation of the Liver Competition 2007 (SLIVER07) datasets demonstrates state-of-the-art performance: 97.75% Dice Similarity Coefficient (DSC) and 2.40 mm Hausdorff Distance (HD) on LiTS, and 96.65% DSC and 3.10 mm HD on SLIVER07. To enhance explainability, gradient-weighted class activation mapping (Grad-CAM) provides visual insights into the model’s decision-making process, supporting transparency and reliability in liver segmentation.
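The gating idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the choice of encoder tokens as queries and decoder tokens as keys and values, the single attention head, the function name `cross_attention_gate`, and all weight names and shapes are assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention_gate(enc, dec, w_q, w_k, w_v, w_o):
    """Hypothetical single-head cross-attention gate for a skip connection.

    enc: (N, C) encoder skip-connection tokens (assumed to act as queries)
    dec: (M, C) decoder tokens (assumed to act as keys and values)
    w_q, w_k, w_v, w_o: (C, C) projection matrices (illustrative names)
    Returns the refined encoder tokens (N, C) and the attention map (N, M).
    """
    q = enc @ w_q
    k = dec @ w_k
    v = dec @ w_v
    # Scaled dot-product attention: each encoder token attends to decoder context.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)
    context = attn @ v
    # Output projection followed by a residual connection to the encoder features.
    refined = enc + context @ w_o
    return refined, attn


# Toy usage with random tokens; the sizes are arbitrary.
rng = np.random.default_rng(0)
N, M, C = 16, 16, 8
enc = rng.standard_normal((N, C))
dec = rng.standard_normal((M, C))
w_q, w_k, w_v, w_o = (0.1 * rng.standard_normal((C, C)) for _ in range(4))
refined, attn = cross_attention_gate(enc, dec, w_q, w_k, w_v, w_o)
```

In this sketch the residual path means the gate can only add decoder-conditioned corrections to the encoder features: with a zero output projection, the skip connection passes through unchanged.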