Regular Expressions for Fast-response COVID-19 Text Classification

02/18/2021
by Igor L. Markov, et al.

Text classifiers are at the core of many NLP applications and use a variety of algorithmic approaches and software. This paper describes how Facebook determines whether a given piece of text (anything from a hashtag to a post) belongs to a narrow topic such as COVID-19. To fully define a topic and evaluate classifier performance, we employ human-guided iterations of keyword discovery but do not require labeled data. For COVID-19, we build two sets of regular expressions: (1) for 66 languages, with 99% [...], and (2) for the 11 most common languages, with precision >90% [...]. Regular expressions enable low-latency queries from multiple platforms. Response to challenges like COVID-19 is fast, and so are revisions. Comparisons to a DNN classifier show explainable results, higher precision and recall, and less overfitting. Our learnings can be applied to other narrow-topic classifiers.
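To make the idea of a regular-expression topic classifier concrete, here is a minimal Python sketch. It is not the paper's implementation: the keyword patterns, the per-language dictionary, and the function names `is_covid_text` and `precision_recall` are all hypothetical, chosen only for illustration. The second function shows how precision and recall could be computed on a small hand-reviewed sample; the paper's own evaluation procedure is not reproduced here.

```python
import re

# Hypothetical per-language keyword patterns (illustrative only; the
# actual expressions used in the paper are not given in this abstract).
COVID_PATTERNS = {
    "en": r"\b(?:covid(?:[-\s]?19)?|coronavirus|sars[-\s]?cov[-\s]?2)\b",
    "es": r"\b(?:covid(?:[-\s]?19)?|coronavirus|cuarentena)\b",
}

# Compile once so that each query is a single fast regex search.
COMPILED = {
    lang: re.compile(pattern, re.IGNORECASE)
    for lang, pattern in COVID_PATTERNS.items()
}


def is_covid_text(text, lang="en"):
    """Return True if the text matches the topic pattern for the language."""
    pattern = COMPILED.get(lang)
    return bool(pattern and pattern.search(text))


def precision_recall(predictions, labels):
    """Compute (precision, recall) from parallel lists of booleans."""
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, l in pairs if p and l)
    fp = sum(1 for p, l in pairs if p and not l)
    fn = sum(1 for p, l in pairs if not p and l)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

Because each decision comes down to a visible pattern match, a false positive or false negative can be traced to a specific keyword, and a revision amounts to editing a pattern rather than retraining a model.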


