Approach to Predicting News – A Precise Multi-LSTM Network With BERT

04/26/2022
by   Chia-Lin Chen, et al.

Varieties of Democracy (V-Dem) is a new approach to conceptualizing and measuring democracy and politics. It has information for 200 countries and is one of the biggest databases for political science. According to the V-Dem annual democracy report 2019, Taiwan is one of the two countries most targeted by false information disseminated by foreign governments. The report also shows that this "made-up news" has caused a great deal of confusion in Taiwanese society and has serious impacts on global stability. Although several applications help distinguish false information, we found that the pre-processing step of categorizing the news is still done by human labor. Human annotators, however, make mistakes and cannot work for long periods. The growing demand for automation in recent decades shows that when machines perform as well as humans or better, using them reduces the human burden and cuts costs. Therefore, in this work, we build a predictive model to classify the category of news. The corpora we use contain 28,358 and 200 news articles scraped from the website of the online newspaper Liberty Times Net (LTN), covering 8 categories: Technology, Entertainment, Fashion, Politics, Sports, International, Finance, and Health. First, we use Bidirectional Encoder Representations from Transformers (BERT) for word embeddings, which transform each Chinese character into a (1, 768) vector. Then, we use a Long Short-Term Memory (LSTM) layer to transform the word embeddings into sentence embeddings, and add another LSTM layer to transform those into document embeddings. Each document embedding is the input to the final prediction model, which contains two Dense layers and one Activation layer. Each document embedding is transformed into a vector of 8 real numbers, and the highest value determines which of the 8 news categories is predicted, with up to 99
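The pipeline described in the abstract (character vectors, then a sentence-level LSTM, then a document-level LSTM, then two Dense layers and an activation) can be sketched in plain Python as follows. This is a minimal illustration of the data flow, not the authors' implementation: the hidden width of 16, the untrained random weights, the choice of softmax as the activation, and the names `TinyLSTM`, `dense`, and `classify` are all assumptions, and random vectors stand in for real BERT character embeddings.

```python
import math
import random

CATEGORIES = ["Technology", "Entertainment", "Fashion", "Politics",
              "Sports", "International", "Finance", "Health"]
EMBED_DIM = 768  # per-character vector size stated in the abstract
HIDDEN = 16      # assumed LSTM width, kept small for the demo


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(vec, weights, bias):
    """Affine layer: weights holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TinyLSTM:
    """Standard LSTM cell run over a sequence; returns the last hidden state."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        width = input_size + hidden_size  # each gate sees [x_t ; h_{t-1}]
        self.w = {g: rand_matrix(hidden_size, width, rng) for g in "ifog"}
        self.b = {g: [0.0] * hidden_size for g in "ifog"}

    def encode(self, sequence):
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in sequence:
            z = list(x) + h
            pre = {g: dense(z, self.w[g], self.b[g]) for g in "ifog"}
            i = [sigmoid(v) for v in pre["i"]]    # input gate
            f = [sigmoid(v) for v in pre["f"]]    # forget gate
            o = [sigmoid(v) for v in pre["o"]]    # output gate
            g = [math.tanh(v) for v in pre["g"]]  # candidate cell state
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h


def classify(document, sent_lstm, doc_lstm, head):
    """document: list of sentences, each a list of character vectors."""
    sentence_vecs = [sent_lstm.encode(sentence) for sentence in document]
    doc_vec = doc_lstm.encode(sentence_vecs)  # one vector per document
    w1, b1, w2, b2 = head
    scores = dense(dense(doc_vec, w1, b1), w2, b2)  # two Dense layers
    probs = softmax(scores)                         # assumed activation
    best = max(range(len(probs)), key=probs.__getitem__)
    return CATEGORIES[best], probs


rng = random.Random(0)
sent_lstm = TinyLSTM(EMBED_DIM, HIDDEN, rng)
doc_lstm = TinyLSTM(HIDDEN, HIDDEN, rng)
head = (rand_matrix(HIDDEN, HIDDEN, rng), [0.0] * HIDDEN,
        rand_matrix(len(CATEGORIES), HIDDEN, rng), [0.0] * len(CATEGORIES))

# Placeholder input: two sentences of 5 and 3 characters, random vectors
# standing in for BERT character embeddings.
document = [[[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(n)]
            for n in (5, 3)]
label, probs = classify(document, sent_lstm, doc_lstm, head)
print(label, [round(p, 3) for p in probs])
```

Because the weights are untrained, the predicted label here is arbitrary; the sketch only shows how a document of any number of sentences is reduced to one vector of 8 scores, whose largest entry selects the category.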


