Very Large Language Model as a Unified Methodology of Text Mining

12/19/2022
by Meng Jiang, et al.

Text data mining is the process of deriving essential information from language text. Typical text mining tasks include text categorization, text clustering, topic modeling, information extraction, and text summarization. Different data sets are collected and different algorithms are designed for each type of task. In this paper, I present a blue sky idea: that very large language models (VLLMs) will become an effective unified methodology of text mining. I discuss at least three advantages of this new methodology over conventional methods. Finally, I discuss the challenges in the design and development of VLLM techniques for text mining.

research
01/10/2012

Pbm: A new dataset for blog mining

Text mining is becoming vital as Web 2.0 offers collaborative content cr...
research
07/10/2017

A Brief Survey of Text Mining: Classification, Clustering and Extraction Techniques

The amount of text that is generated every day is increasing dramaticall...
research
06/20/2023

ChatGPT Chemistry Assistant for Text Mining and Prediction of MOF Synthesis

We use prompt engineering to guide ChatGPT in the automation of text min...
research
02/07/2016

Scalable Text Mining with Sparse Generative Models

The information age has brought a deluge of data. Much of this is in tex...
research
01/03/2023

ClusTop: An unsupervised and integrated text clustering and topic extraction framework

Text clustering and topic extraction are two important tasks in text min...
research
08/01/2022

Data Collection and Analysis of French Dialects

This paper discusses creating and analysing a new dataset for data minin...
research
04/08/2015

Mining and discovering biographical information in Difangzhi with a language-model-based approach

We present results of expanding the contents of the China Biographical D...