Transformer-Based Structure-Aware Lung Cancer Image Segmentation with Multi-Scale Fusion and Boundary-Guided Prediction

Abstract

This paper addresses three common challenges in lung cancer medical images: blurred boundaries, large variations in target size, and complex anatomical structures. It proposes a structure-aware semantic segmentation method built on an improved Mask2Former framework. The model adopts a unified Transformer-based architecture and introduces a multi-scale feature fusion mechanism and a cross-scale query attention module, which together strengthen the joint modeling of local details and global semantics. In the feature encoding stage, positional embeddings are incorporated to improve structural awareness. In the mask prediction stage, a boundary-guided module strengthens contour recognition. A series of sensitivity experiments is conducted under varied settings, including learning rate adjustment, resolution changes, noise perturbation, and data imbalance, to evaluate the model's stability and robustness systematically across training conditions and input environments. Experimental results show that the proposed method performs strongly on multiple mainstream segmentation metrics and exhibits good generalization and structural representation capability. The method provides effective technical support for automated lung cancer segmentation, and the results support the value of structure-aware mechanisms in medical imaging tasks.
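The abstract does not say how the boundary-guided module obtains its contour signal. As a purely illustrative sketch, one common way to derive a boundary target from a binary segmentation mask is to mark every foreground pixel that touches background in its 4-neighbourhood. The function name `boundary_map` and the choice of 4-connectivity are assumptions made here for illustration and are not taken from the paper.

```python
from typing import List

Mask = List[List[int]]  # binary mask: 1 = lesion (foreground), 0 = background


def boundary_map(mask: Mask) -> Mask:
    """Return a mask marking foreground pixels that touch background.

    Hypothetical helper, not the paper's module: a pixel counts as boundary
    if it is foreground and at least one 4-connected neighbour is background
    or lies outside the image.
    """
    h = len(mask)
    w = len(mask[0]) if h else 0
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue  # background pixels are never boundary
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                # Out-of-bounds neighbours are treated as background.
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    out[y][x] = 1
                    break
    return out
```

A map of this kind could serve as an auxiliary supervision target alongside the full mask, which is one plausible reading of "boundary-guided" prediction; the paper's actual design may differ.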
