Using Deep Neural Networks to Translate Multi-lingual Threat Intelligence

07/19/2018
by Priyanka Ranade, et al.

The multilingual nature of the Internet complicates the cybersecurity community's ongoing efforts to strategically mine threat intelligence from open-source intelligence (OSINT) data on the web. OSINT sources such as social media, blogs, and dark web vulnerability markets exist in diverse languages, which hinders security analysts who cannot draw conclusions from intelligence in languages they do not understand. Although third-party translation engines are growing stronger, they are unsuited for private security environments. First, sensitive intelligence is not a permitted input to third-party engines due to privacy and confidentiality policies. Second, third-party engines produce generalized translations that tend to lack specialized cybersecurity terminology. In this paper, we address these issues and describe a system that enables threat intelligence understanding across unfamiliar languages. We create a neural network based system that takes in cybersecurity data in a foreign language and outputs the corresponding English translation. The English translation can then be understood by an analyst, and can also serve as input to an AI-based cyber-defense system that can take mitigative action. As a proof of concept, we have created a pipeline that takes Russian threat data and generates the corresponding English, RDF, and vectorized representations. Our network is optimized specifically for translating cybersecurity data.
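To make the data flow of the proof-of-concept pipeline concrete, the following is a minimal sketch of the three output stages named in the abstract: English translation, an RDF representation, and a vectorized representation. It is not the authors' implementation. The neural translator is replaced by a hypothetical toy glossary lookup (`TOY_GLOSSARY`), the `example.org` URIs and the `description` predicate are invented for illustration, and the vectorization shown is a plain bag-of-words count chosen only for simplicity.

```python
from collections import Counter
from dataclasses import dataclass

# ASSUMPTION: a toy glossary standing in for the trained translation network.
TOY_GLOSSARY = {
    "уязвимость": "vulnerability",
    "в": "in",
    "браузере": "browser",
    "эксплойт": "exploit",
}


@dataclass
class ThreatRecord:
    source_text: str   # original (e.g. Russian) text
    english: str       # translated text
    rdf_triples: list  # N-Triples-style lines
    vector: dict       # token -> count


def translate(text: str) -> str:
    """Placeholder translator: word-by-word glossary lookup.

    Unknown tokens are passed through unchanged.
    """
    tokens = text.lower().split()
    return " ".join(TOY_GLOSSARY.get(tok, tok) for tok in tokens)


def to_rdf(record_id: str, english: str) -> list:
    """Emit one N-Triples-style line; URIs are hypothetical."""
    subject = f"<http://example.org/threat/{record_id}>"
    predicate = "<http://example.org/vocab#description>"
    escaped = english.replace("\\", "\\\\").replace('"', '\\"')
    return [f'{subject} {predicate} "{escaped}"@en .']


def vectorize(english: str) -> dict:
    """Bag-of-words counts as a stand-in for a learned embedding."""
    return dict(Counter(english.split()))


def run_pipeline(record_id: str, text: str) -> ThreatRecord:
    """Translate, then derive RDF and vector forms from the English text."""
    english = translate(text)
    return ThreatRecord(
        source_text=text,
        english=english,
        rdf_triples=to_rdf(record_id, english),
        vector=vectorize(english),
    )


if __name__ == "__main__":
    record = run_pipeline("t1", "Уязвимость в браузере")
    print(record.english)
    print(record.rdf_triples[0])
    print(record.vector)
```

In this sketch both downstream representations are derived from the English output, mirroring the abstract's point that the translation can feed an analyst as well as an automated defense system. In a real deployment the `translate` function would wrap a locally hosted model so that sensitive text never leaves the private environment.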
