Event Schema Induction using Tensor Factorization with Back-off
The goal of Event Schema Induction (ESI) is to identify schemas of events from a corpus of documents. For example, given documents from the sports domain, we would like to infer that win(WinningPlayer, Trophy, OpponentPlayer, Location) is an important event schema for this domain. Automatic discovery of such event schemas is an important first step towards building domain-specific Knowledge Graphs (KGs). ESI has been the focus of some prior research, with generative models achieving the best performance. In this paper, we propose TFB, a tensor factorization-based method with back-off for ESI. TFB solves a novel objective to factorize Open Information Extraction (OpenIE) tuples and thereby induce binary schemas. Event schemas are then induced from this set of binary schemas by solving a constrained clique problem. To the best of our knowledge, this is the first application of tensor factorization to the ESI problem. TFB outperforms the current state of the art by 52 absolute points in accuracy, while achieving a 90x speedup on average. We hope to make all the code and datasets used in the paper publicly available upon publication of the paper.
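To make the second stage more concrete, the following is a minimal sketch of how binary schemas might be combined into a higher-arity event schema by looking for cliques. It is not the TFB algorithm: the abstract does not spell out the constraints of its clique problem, so this example substitutes plain maximal-clique enumeration (basic Bron-Kerbosch). The input pairs in `binary_schemas` and the function name `maximal_cliques` are hypothetical and chosen only to mirror the win(...) example above.

```python
def maximal_cliques(edges):
    """Enumerate maximal cliques of an undirected graph (basic Bron-Kerbosch)."""
    # Build an adjacency map from the list of undirected edges.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already-explored nodes.
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Hypothetical binary schemas for the relation "win": each pair says that the
# two argument types were observed together as arguments of "win".
binary_schemas = [
    ("WinningPlayer", "Trophy"),
    ("WinningPlayer", "OpponentPlayer"),
    ("WinningPlayer", "Location"),
    ("Trophy", "OpponentPlayer"),
    ("Trophy", "Location"),
    ("OpponentPlayer", "Location"),
    ("Location", "Date"),  # a weaker association that does not join the larger clique
]

cliques = maximal_cliques(binary_schemas)
largest = max(cliques, key=len)
print("win(" + ", ".join(sorted(largest)) + ")")
```

In this toy graph the four mutually connected argument types form a single clique, which is read off as one candidate event schema, while the pair involving `Date` remains a separate, smaller clique. A constrained formulation, as the abstract mentions, would presumably restrict which cliques are admissible; those constraints are left out here.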