Co-Learning Feature Fusion Maps from PET-CT Images of Lung Cancer
The analysis of multi-modality positron emission tomography and computed tomography (PET-CT) images requires combining the sensitivity of PET for detecting abnormal regions with the anatomical localization provided by CT. However, current methods for PET-CT image analysis either process the modalities separately or fuse information from each modality based on knowledge about the image analysis task. These methods generally do not consider that the visual characteristics of each modality vary spatially, encode different information, and have different priorities at different locations. For example, high abnormal PET uptake in the lungs is more meaningful for tumor detection than physiological PET uptake in the heart. Our aim is to improve the fusion of complementary information in multi-modality PET-CT with a new supervised convolutional neural network (CNN) that learns how to fuse this information for multi-modality medical image analysis. Our CNN first encodes modality-specific features and then uses them to derive a spatially varying fusion map that quantifies the relative importance of each modality's features across different spatial locations. These fusion maps are then multiplied with the modality-specific feature maps to obtain a representation of the complementary multi-modality information at different locations, which can then be used for image analysis, e.g., region detection. We evaluated our CNN on a region detection problem using a dataset of PET-CT images of lung cancer. We compared our method to baseline techniques for multi-modality image analysis (pre-fused inputs, multi-branch techniques, multi-channel techniques) and demonstrated that our approach achieved significantly higher accuracy (p < 0.05) than the baselines.
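The fusion step described above (derive a spatially varying fusion map from the modality-specific features, then multiply it with those features) can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the 1x1 projection `weights`, the softmax normalization across modalities, and the function names `fusion_map` and `fuse` are all assumptions chosen for illustration, and random arrays stand in for the encoder outputs.

```python
import numpy as np

def fusion_map(pet_feat, ct_feat, weights):
    """Per-location importance of each modality (hypothetical formulation).

    pet_feat, ct_feat: modality-specific feature maps, shape (C, H, W).
    weights: assumed 1x1 projection of shape (2, 2*C), one row per modality.
    Returns an array of shape (2, H, W) whose values sum to 1 at each pixel.
    """
    stacked = np.concatenate([pet_feat, ct_feat], axis=0)    # (2C, H, W)
    logits = np.einsum("mc,chw->mhw", weights, stacked)      # (2, H, W)
    logits = logits - logits.max(axis=0, keepdims=True)      # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=0, keepdims=True)               # softmax over modalities

def fuse(pet_feat, ct_feat, weights):
    """Multiply each modality's features by its fusion map and stack the results."""
    fmap = fusion_map(pet_feat, ct_feat, weights)
    weighted_pet = fmap[0][None, :, :] * pet_feat             # broadcast over channels
    weighted_ct = fmap[1][None, :, :] * ct_feat
    return np.concatenate([weighted_pet, weighted_ct], axis=0), fmap

# Random stand-ins for encoder outputs (assumed sizes, not from the paper).
rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
pet = rng.standard_normal((C, H, W))
ct = rng.standard_normal((C, H, W))
w = rng.standard_normal((2, 2 * C))

fused, fmap = fuse(pet, ct, w)
```

In a trained network the projection would be learned under supervision, so locations such as the lungs could receive a higher PET weight than locations such as the heart; here the weights are random and the sketch shows only the data flow.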