Spatiotemporal Feature Learning Based on Two-Step LSTM and Transformer for CT Scans

by Chih-Chung Hsu, et al.

Computed tomography (CT) imaging can be very practical for diagnosing various diseases. However, CT data are highly diverse, since the resolution and the number of slices of a CT scan are determined by the machine and its settings. Conventional deep learning models struggle to tackle such diverse data because deep neural networks essentially require input data of a consistent shape. In this paper, we propose a novel and effective two-step approach to tackle this issue thoroughly for COVID-19 symptom classification. First, the semantic feature embedding of each slice of a CT scan is extracted by conventional backbone networks. Then, we propose a long short-term memory (LSTM) and Transformer-based sub-network to handle temporal feature learning, leading to spatiotemporal feature representation learning. In this fashion, the proposed two-step LSTM model can prevent overfitting as well as increase performance. Comprehensive experiments reveal that the proposed two-step models not only show excellent performance but also complement each other. More specifically, the two-step LSTM model has a lower false-negative rate, while the two-step Swin model has a lower false-positive rate. In summary, we suggest adopting a model ensemble for more stable and promising performance in real-world applications.
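The two-step idea in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation: `embed_slice` is a hypothetical stand-in for a backbone network, `TinyLSTM` is a minimal untrained LSTM with random weights, and combining models by averaging probabilities in `ensemble` is an assumed strategy, since the abstract does not say how the ensemble is formed. The point is only to show how scans with different resolutions and slice counts can pass through one pipeline.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def embed_slice(slice_pixels, dim=4):
    """Step 1 (stand-in for a backbone): map one 2D slice of any
    resolution to a fixed-length vector of chunked mean intensities."""
    flat = [p for row in slice_pixels for p in row]
    chunk = max(1, len(flat) // dim)
    return [sum(flat[i * chunk:(i + 1) * chunk]) / chunk for i in range(dim)]


class TinyLSTM:
    """Step 2: a minimal LSTM that folds a variable-length sequence of
    slice embeddings into one hidden state. Weights are random (untrained)."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        n = input_dim + hidden_dim
        self.hidden_dim = hidden_dim
        # One weight matrix and bias per gate: input, forget, candidate, output.
        self.W = {g: [[rng.uniform(-0.5, 0.5) for _ in range(n)]
                      for _ in range(hidden_dim)] for g in "ifgo"}
        self.b = {g: [0.0] * hidden_dim for g in "ifgo"}

    def _gate(self, g, xh, act):
        return [act(sum(w * v for w, v in zip(row, xh)) + b)
                for row, b in zip(self.W[g], self.b[g])]

    def forward(self, sequence):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in sequence:
            xh = x + h  # concatenate current embedding and previous hidden state
            i = self._gate("i", xh, sigmoid)
            f = self._gate("f", xh, sigmoid)
            g = self._gate("g", xh, math.tanh)
            o = self._gate("o", xh, sigmoid)
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h


def classify_scan(scan, lstm, head_weights, head_bias=0.0, dim=4):
    """Embed every slice, run the sequence model, apply a linear head."""
    embeddings = [embed_slice(s, dim) for s in scan]
    h = lstm.forward(embeddings)
    logit = sum(w * v for w, v in zip(head_weights, h)) + head_bias
    return sigmoid(logit)


def ensemble(probabilities):
    """Assumed ensemble rule: plain average of per-model probabilities."""
    return sum(probabilities) / len(probabilities)


def make_scan(rng, n_slices, height, width):
    """Synthetic scan: n_slices slices of random intensities."""
    return [[[rng.random() for _ in range(width)] for _ in range(height)]
            for _ in range(n_slices)]


rng = random.Random(1)
scan_a = make_scan(rng, 5, 8, 8)      # few slices, low resolution
scan_b = make_scan(rng, 12, 16, 16)   # more slices, higher resolution

lstm = TinyLSTM(input_dim=4, hidden_dim=3)
head = [0.3, -0.2, 0.5]               # hypothetical classifier weights
p_a = classify_scan(scan_a, lstm, head)
p_b = classify_scan(scan_b, lstm, head)
```

Both scans produce a single probability even though their shapes differ, because the per-slice embedding fixes the feature size and the recurrent step absorbs the varying slice count. In a real system the stand-in embedding would be replaced by a trained backbone, and the sequence model would be trained on labeled scans.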


