WikiAsp: A Dataset for Multi-domain Aspect-based Summarization

11/16/2020
by Hiroaki Hayashi, et al.

Aspect-based summarization is the task of generating focused summaries based on specific points of interest. Such summaries aid efficient analysis of text, such as quickly understanding reviews or opinions from different angles. However, due to large differences in the type of aspects for different domains (e.g., sentiment, product features), the development of previous models has tended to be domain-specific. In this paper, we propose WikiAsp, a large-scale dataset for multi-domain aspect-based summarization that attempts to spur research in the direction of open-domain aspect-based summarization. Specifically, we build the dataset using Wikipedia articles from 20 different domains, using the section titles and boundaries of each article as a proxy for aspect annotation. We propose several straightforward baseline models for this task and conduct experiments on the dataset. Results highlight key challenges that existing summarization models face in this setting, such as proper pronoun handling of quoted sources and consistent explanation of time-sensitive events.
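The abstract states that section titles and boundaries serve as a proxy for aspect annotation. A minimal sketch of that idea follows. It assumes article text in a wikitext-like form with `== Title ==` headings; the function name `split_by_section` and the input format are hypothetical and are not taken from the authors' pipeline.

```python
import re
from typing import Dict, List

# Assumption: top-level sections are marked as "== Title ==" (wikitext style).
HEADING = re.compile(r"^==\s*([^=]+?)\s*==\s*$")


def split_by_section(wikitext: str) -> Dict[str, str]:
    """Map each lower-cased section title (the proxy aspect) to its section text."""
    sections: Dict[str, List[str]] = {}
    current = None
    for line in wikitext.splitlines():
        match = HEADING.match(line)
        if match:
            # A new heading closes the previous section and opens a new aspect.
            current = match.group(1).lower()
            sections.setdefault(current, [])
        elif line.startswith("="):
            # Deeper headings (e.g. "=== Sub ===") are dropped; their text
            # stays with the enclosing top-level section in this sketch.
            continue
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    # Text before the first heading has no aspect and is ignored.
    return {title: " ".join(body) for title, body in sections.items() if body}


doc = (
    "Intro text.\n"
    "== History ==\n"
    "Founded in 1900.\n"
    "== Geography ==\n"
    "Lies on a river.\n"
    "Has hills."
)
aspects = split_by_section(doc)
```

Each key in `aspects` plays the role of an aspect label and each value the role of an aspect-specific target summary. A real pipeline would also need to choose which section titles count as aspects for a given domain, which this sketch does not attempt.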
