Bounding Entities within Dense Subtensors

10/15/2018
by Yikun Ban, et al.

Group-based fraud detection is a promising way to catch fraud on the Internet because 1) it does not require a long activity history for any single user; and 2) fraudsters find it difficult to evade because of their economic constraints. Unfortunately, existing work does not capture the full picture of a fraud group: it relies either on graph-based grouping features such as edge density or on probability-based features, but not both. To our knowledge, we are the first to combine these features into a single set of metrics: the complicity score and the fraud density score. Both scores can be customized to accommodate different data types and data distributions. Moreover, algorithms built around these metrics use only localized graph features, and thus scale easily on modern big data frameworks. We have applied our algorithms to a real production dataset and achieved state-of-the-art results compared with existing approaches.
