Towards automatic extractive text summarization of A-133 Single Audit reports with machine learning

by Vivian T. Chou, et al.

The rapid growth of text data has motivated the development of machine-learning-based automatic text summarization strategies that concisely capture the essential ideas in a larger text. This study aimed to devise an extractive summarization method for A-133 Single Audits, which assess whether recipients of federal grants comply with program requirements for the use of federal funding. Currently, these voluminous audits must be manually analyzed by officials for oversight, risk management, and prioritization purposes. Automated summarization has the potential to streamline these processes. Analysis focused on the "Findings" section of 20,000 Single Audits spanning 2016-2018. Following text preprocessing and GloVe embedding, sentence-level k-means clustering was performed to partition sentences by topic and to establish the importance of each sentence. For each audit, key summary sentences were extracted by proximity to cluster centroids. Summaries were judged by non-expert human evaluation and compared to human-generated summaries using the ROUGE metric. Though the goal was to fully automate summarization of A-133 audits, human input was required at various stages because of large variability in audit writing style, content, and context. Examples of human inputs include the number of clusters, the choice to keep or discard certain clusters based on their content relevance, and the definition of a top sentence. Overall, this approach made progress towards automated extractive summaries of A-133 audits; future work will focus on full automation and on improving summary consistency. This work highlights the inherent difficulty and subjective nature of automated summarization in a real-world application.
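The clustering-and-extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Several things here are assumptions made for illustration: the function names (`embed_sentence`, `kmeans`, `summarize`), averaging word vectors to obtain a sentence vector, the deterministic farthest-first initialization, and the rule of keeping exactly one sentence per cluster. Real GloVe vectors would replace the toy `word_vectors` dictionary.

```python
import numpy as np

def embed_sentence(sentence, word_vectors, dim):
    """Average the vectors of known tokens (an assumed pooling choice).
    `word_vectors` is a hypothetical dict standing in for GloVe."""
    vecs = [word_vectors[t] for t in sentence.lower().split() if t in word_vectors]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def _assign(X, centroids):
    """Return, for each row of X, the index of the nearest centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, n_iter=100):
    """Lloyd's k-means with farthest-first initialization (deterministic)."""
    centroids = [X[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen centroid.
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(n_iter):
        labels = _assign(X, centroids)
        updated = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids, _assign(X, centroids)

def summarize(sentences, vectors, k):
    """Keep, per cluster, the sentence nearest the centroid, in document order."""
    X = np.asarray(vectors, dtype=float)
    centroids, labels = kmeans(X, k)
    chosen = []
    for j in range(k):
        idx = np.where(labels == j)[0]
        if len(idx):
            nearest = idx[np.linalg.norm(X[idx] - centroids[j], axis=1).argmin()]
            chosen.append(int(nearest))
    return [sentences[i] for i in sorted(chosen)]

# Toy usage: six "sentences" whose 2-d vectors form two obvious topics.
toy_sentences = ["s0", "s1", "s2", "s3", "s4", "s5"]
toy_vectors = [[0, 0], [0.1, 0], [0, 0.1], [10, 10], [10.1, 10], [10, 10.1]]
toy_summary = summarize(toy_sentences, toy_vectors, k=2)
```

In this sketch the number of clusters `k` is passed in by the caller, which mirrors the abstract's point that the cluster count was one of the human inputs. Which clusters to keep and how many sentences count as "top" per cluster are likewise left as choices outside the code.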



The Rule of Three: Abstractive Text Summarization in Three Bullet Points

Neural network-based approaches have become widespread for abstractive t...

Automatic Text Summarization Methods: A Comprehensive Review

One of the most pressing issues that have arisen due to the rapid growth...

Beyond Text Generation: Supporting Writers with Continuous Automatic Text Summaries

We propose a text editor to help users plan, structure and reflect on th...

Automatic Summarization of Online Debates

Debate summarization is one of the novel and challenging research areas ...

What Makes a Good Summary? Reconsidering the Focus of Automatic Summarization

Automatic text summarization has enjoyed great progress over the last ye...

GEMINI: Controlling the Sentence-level Writing Style for Abstractive Text Summarization

Human experts write summaries using different techniques, including rewr...

A General Contextualized Rewriting Framework for Text Summarization

The rewriting method for text summarization combines extractive and abst...
