Co-Stack Residual Affinity Networks with Multi-level Attention Refinement for Matching Text Sequences

10/06/2018
by Yi Tay, et al.

Learning a matching function between two text sequences is a long-standing problem in NLP research. This task enables many potential applications, such as question answering and paraphrase identification. This paper proposes Co-Stack Residual Affinity Networks (CSRAN), a new and universal neural architecture for this problem. CSRAN is a deep architecture built on stacked (multi-layered) recurrent encoders. Stacked/deep architectures are traditionally difficult to train, owing to inherent weaknesses such as poor feature propagation and vanishing gradients. CSRAN incorporates two novel components to take advantage of the stacked architecture. First, it introduces a new bidirectional alignment mechanism that learns affinity weights by fusing sequence pairs across stacked hierarchies. Second, it leverages a multi-level attention refinement component between stacked recurrent layers. The key intuition is that, by leveraging information across all network hierarchies, we can improve not only gradient flow but also overall performance. We conduct extensive experiments on six well-studied text sequence matching datasets, achieving state-of-the-art performance on all of them.
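To make the first component concrete, below is a minimal PyTorch sketch of the co-stack affinity and bidirectional alignment idea as described in the abstract. The function names, tensor shapes, and the plain dot-product scoring are illustrative assumptions, not the paper's exact formulation; the full model also interleaves a multi-level attention refinement step between the stacked recurrent layers, which this sketch omits.

```python
import torch

def costack_affinity(a_layers, b_layers):
    """Fuse pairwise affinities across all stacked encoder layers.

    a_layers / b_layers: lists of K tensors with shapes [len_a, d] and
    [len_b, d], one per stacked recurrent layer. Per-layer dot-product
    affinities are summed (a residual-style fusion), so the final
    alignment sees information from every network hierarchy.
    """
    e = torch.zeros(a_layers[0].size(0), b_layers[0].size(0))
    for a_k, b_k in zip(a_layers, b_layers):
        e = e + a_k @ b_k.t()
    return e

def bidirectional_align(a, b, e):
    """Soft-align each sequence against the other with the fused affinity."""
    b_summary = torch.softmax(e, dim=1) @ b      # per token of A: attended view of B
    a_summary = torch.softmax(e, dim=0).t() @ a  # per token of B: attended view of A
    return b_summary, a_summary

# Toy usage with K=3 stacked layers and hidden size 8 (hypothetical numbers).
K, d, len_a, len_b = 3, 8, 5, 7
a_layers = [torch.randn(len_a, d) for _ in range(K)]
b_layers = [torch.randn(len_b, d) for _ in range(K)]
e = costack_affinity(a_layers, b_layers)
b_to_a, a_to_b = bidirectional_align(a_layers[-1], b_layers[-1], e)
print(e.shape, b_to_a.shape, a_to_b.shape)  # (5, 7), (5, 8), (7, 8)
```

Because the affinity matrix is an additive fusion over all layers, gradients from the alignment flow directly into every encoder layer rather than only the topmost one, which is the abstract's stated motivation for the design.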

