Finding Influential Instances for Distantly Supervised Relation Extraction
Distant supervision has proven highly beneficial for training relation extraction models, but it often suffers from high label noise. In this work, we propose REIF, a novel model-agnostic instance subsampling method for distantly supervised relation extraction that bridges the gap in realizing influence subsampling in deep learning. It comprises two key steps: first, calculating instance-level influences that measure how much each training instance contributes to the change in our model's validation loss; then, deriving sampling probabilities via the proposed sigmoid sampling function to perform batch-in-bag sampling. We design a fast influence subsampling scheme that reduces the computational complexity from O(mn) to O(1), and analyze its robustness when the sigmoid sampling function is employed. Empirical experiments demonstrate our method's superiority over the baselines and its ability to support interpretable instance selection.
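The second step, turning influence scores into sampling probabilities, can be illustrated with a minimal sketch. The abstract does not give the exact form of the sigmoid sampling function, so everything below is an assumption for illustration: the function names, the sharpness parameter `alpha`, the normalization by the largest absolute influence, and the sign convention (a positive influence is taken to mean the instance increases validation loss and should be sampled less often). It is not the authors' implementation.

```python
import math
import random


def sigmoid_sampling_probs(influences, alpha=1.0):
    """Map influence scores to keep-probabilities with a logistic function.

    Assumed convention (not from the paper): a positive influence means the
    instance raises validation loss, so it receives a lower probability.
    `alpha` is a hypothetical sharpness parameter.
    """
    if not influences:
        return []
    # Normalize by the largest magnitude so the exponent stays in [-alpha, alpha].
    scale = max(abs(v) for v in influences) or 1.0
    return [1.0 / (1.0 + math.exp(alpha * v / scale)) for v in influences]


def subsample_bag(bag, influences, alpha=1.0, rng=None):
    """Keep each instance of a bag independently with its sigmoid probability."""
    rng = rng or random.Random(0)
    probs = sigmoid_sampling_probs(influences, alpha)
    return [x for x, p in zip(bag, probs) if rng.random() < p]


# Toy bag of three sentences with made-up influence scores.
bag = ["sent_a", "sent_b", "sent_c"]
influences = [-2.0, 0.0, 2.0]
probs = sigmoid_sampling_probs(influences, alpha=4.0)
kept = subsample_bag(bag, influences, alpha=4.0)
```

In this sketch an instance with zero influence is kept with probability 0.5, and the bounded output of the sigmoid keeps extreme influence values from producing probabilities outside (0, 1).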