End-to-End Attention-based Image Captioning

04/30/2021
by Carola Sundaramoorthy, et al.

In this paper, we address image captioning for molecular translation: given an image of a molecular structure, the model predicts its chemical notation in InChI format. Current approaches are mainly rule-based or CNN+RNN based. However, they appear to underperform on noisy images and on images with a small number of distinguishable features. To overcome this, we propose an end-to-end transformer model. Compared with attention-based techniques, our proposed model performs better on molecular datasets.
