PRISM: A Unified Framework of Parameterized Submodular Information Measures for Targeted Data Subset Selection and Summarization

02/27/2021
by Vishal Kaushal, et al.

As datasets grow, techniques for finding smaller yet effective subsets with specific characteristics become important. Motivated by this, we present PRISM, a rich class of Parameterized Submodular Information Measures that can be used in applications where such targeted subsets are desired. We demonstrate the utility of PRISM in two such applications. First, we apply PRISM to improve a supervised model's performance at a given additional labeling cost through targeted subset selection (PRISM-TSS), where a subset of unlabeled points matching a target set is added to the training set. We show that PRISM-TSS generalizes, and is connected to, several existing approaches to targeted data subset selection. Second, we apply PRISM to the more nuanced task of targeted summarization (PRISM-TSUM), where data (e.g., image collections, text, or videos) is summarized for quicker human consumption while taking additional user intent into account. PRISM-TSUM handles multiple flavors of targeted summarization, such as query-focused, topic-irrelevant, privacy-preserving, and update summarization, in a unified way. We show that PRISM-TSUM also generalizes and unifies several existing approaches to targeted summarization. Through extensive experiments on image classification and image-collection summarization, we empirically show that PRISM-TSS and PRISM-TSUM outperform state-of-the-art methods.
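The abstract does not spell out PRISM's parameterized functions, so the following is only a minimal sketch of the general idea behind targeted subset selection: greedily pick unlabeled points that best "cover" a target set under a monotone submodular objective. The objective used here (a facility-location-style coverage of the targets), the clipped cosine similarity, and all function names are assumptions made for illustration; they are not the measures defined in the paper.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def similarity(u: Vector, v: Vector) -> float:
    """Cosine similarity clipped at 0 (an assumed, illustrative kernel)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return max(0.0, dot / (norm_u * norm_v))


def target_coverage(selected: List[int], pool: List[Vector],
                    targets: List[Vector]) -> float:
    """Assumed objective: each target is credited with its best match in the
    selected set. This is monotone submodular in `selected`."""
    if not selected:
        return 0.0
    return sum(max(similarity(pool[i], t) for i in selected) for t in targets)


def greedy_targeted_selection(pool: List[Vector], targets: List[Vector],
                              budget: int) -> List[int]:
    """Greedily add the pool index with the largest marginal gain until the
    budget is spent. Ties go to the lowest index."""
    selected: List[int] = []
    remaining = set(range(len(pool)))
    current = 0.0
    for _ in range(min(budget, len(pool))):
        best_index, best_gain = None, -1.0
        for i in sorted(remaining):
            gain = target_coverage(selected + [i], pool, targets) - current
            if gain > best_gain:
                best_index, best_gain = i, gain
        selected.append(best_index)
        remaining.remove(best_index)
        current += best_gain
    return selected


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors for unlabeled points and targets.
    pool = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    targets = [[1.0, 0.0], [0.0, 1.0]]
    print(greedy_targeted_selection(pool, targets, budget=2))
```

In a labeling workflow, the returned indices would be the unlabeled points sent for annotation and added to the training set. The greedy loop recomputes the objective from scratch for every candidate, which keeps the sketch short but is not efficient for large pools.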

