[Submitted on 31 Aug 2023]

Title: Towards Optimal Patch Size in Vision Transformers for Tumor Segmentation, by Ramtin Mojtahedi and 3 other authors

Abstract: Detection of tumors in metastatic colorectal cancer (mCRC) is crucial for early diagnosis and treatment of liver cancer. Convolutional neural networks (CNNs) have been widely used for segmenting 3D computerized tomography (CT) scans, but they have limitations in capturing long-range dependencies and global context due to their limited kernel size. Vision transformers have been introduced to overcome this limitation, but their performance on tumor segmentation is affected by the input patch size. This paper proposes a technique to select the optimal input patch size for vision transformers based on the average volume size of metastasis lesions. The suggested framework is validated using a transfer-learning technique, which shows improved performance in terms of the Dice similarity coefficient (DSC) by pre-training the model with a larger tumor volume using the optimal patch size and then training it with a smaller one. The experimental results demonstrate consistent and improved performance on a multi-resolution metastatic colorectal cancer dataset. This study lays the foundation for optimizing the semantic segmentation of small objects using vision transformers. The implementation source code is available at: this https URL.
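
For readers unfamiliar with the metric named in the abstract, the Dice similarity coefficient of a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard definition only, not the authors' implementation: the function name `dice_coefficient`, the flat 0/1 list representation of a voxel mask, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks.

    Both masks are flat sequences of 0/1 voxel labels of equal length
    (an assumed representation chosen to keep the sketch dependency-free).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")

    # Voxels labelled as tumor in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total number of tumor voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they share no tumor voxels. Because the denominator counts only foreground voxels, a few misclassified voxels lower the score of a small lesion far more than that of a large one, which is one reason small-object segmentation is hard to score well on this metric.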

Submission history

From: Mohammad Hamghalam

[v1] Thu, 31 Aug 2023 09:57:27 UTC (1,289 KB)