Landmark Detection using Transformer Toward Robot-assisted Nasal Airway Intubation

by Tianhang Liu, et al.
The Chinese University of Hong Kong

Robot-assisted airway intubation requires high accuracy in locating targets and organs. Two vital landmarks, the nostrils and the glottis, can be detected during the procedure to support the stages of nasal intubation. Automated landmark detection can provide accurate localization and quantitative evaluation. The Detection Transformer (DeTR) brought a new paradigm to object detection by modeling long-range dependencies. However, the current DeTR requires many iterations to converge and does not perform well on small objects. This paper proposes a transformer-based landmark detection solution that combines deformable DeTR with a semantic-aligned-matching module for detecting landmarks in robot-assisted intubation. The semantics aligner uses the most discriminative features to align the semantics of object queries and image features in the same embedding space. To evaluate our solution, we use a publicly accessible glottis dataset and automatically annotate a nostril detection dataset. The experimental results demonstrate competitive detection accuracy. Our code is publicly accessible.
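As a rough illustration of the idea behind the semantics aligner, the sketch below rebuilds an object query from the image feature inside its reference box, so that the query and the image features share one embedding space. This is not the authors' implementation. The function name `align_query`, the nested-list feature map, and the use of the squared L2 norm as a stand-in for "most discriminative" are all assumptions made for this example.

```python
from typing import List, Tuple

Vector = List[float]
FeatureMap = List[List[Vector]]  # indexed as [row][col] -> channel vector


def align_query(features: FeatureMap, box: Tuple[int, int, int, int]) -> Vector:
    """Return the feature vector with the largest squared norm inside `box`.

    `box` is (row0, col0, row1, col1), half-open, in feature-map cells.
    Treating the norm as a proxy for "most discriminative" is an assumption
    of this sketch, not something stated in the abstract.
    """
    r0, c0, r1, c1 = box
    best, best_score = None, float("-inf")
    for r in range(r0, r1):
        for c in range(c0, c1):
            vec = features[r][c]
            score = sum(v * v for v in vec)
            if score > best_score:
                best, best_score = vec, score
    if best is None:
        raise ValueError("box covers no feature-map cells")
    # Copy so the caller cannot mutate the feature map through the query.
    return list(best)


# Toy 3x3 feature map with 2 channels per cell (made-up values).
features = [
    [[0.1, 0.0], [0.2, 0.1], [0.0, 0.0]],
    [[0.0, 0.3], [2.0, 1.0], [0.1, 0.1]],
    [[0.0, 0.0], [0.1, 0.2], [3.0, 0.0]],
]

# The query for a box over the top-left 2x2 cells becomes the strongest
# feature vector found inside that box.
query = align_query(features, (0, 0, 2, 2))
```

Because the query is taken directly from the image features, a dot product between the query and any feature-map cell compares like with like. That is the intuition behind matching in a shared embedding space.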




DATR: Domain-adaptive transformer for multi-domain landmark detection

Accurate anatomical landmark detection plays an increasingly vital role ...

LineMarkNet: Line Landmark Detection for Valet Parking

We aim for accurate and efficient line landmark detection for valet park...

Accelerating DETR Convergence via Semantic-Aligned Matching

The recently developed DEtection TRansformer (DETR) establishes a new ob...

Feasibility of Remote Landmark Identification for Cricothyrotomy Using Robotic Palpation

Cricothyrotomy is a life-saving emergency intervention that secures an a...

CEPHA29: Automatic Cephalometric Landmark Detection Challenge 2023

Quantitative cephalometric analysis is the most widely used clinical and...

Semantic-Aligned Matching for Enhanced DETR Convergence and Multi-Scale Feature Fusion

The recently proposed DEtection TRansformer (DETR) has established a ful...

Modelling Lips-State Detection Using CNN for Non-Verbal Communications

Vision-based deep learning models can be promising for speech-and-hearin...
