Temporal Correlation of Internet Observatories and Outposts

03/19/2022
by Jeremy Kepner, et al.

The Internet has become a critical component of modern civilization, requiring scientific exploration akin to endeavors to understand the land, sea, air, and space environments. Understanding the baseline statistical distributions of traffic is essential to the scientific understanding of the Internet. Correlating data from different Internet observatories and outposts can be a useful tool for gaining insights into these distributions. This work compares observed sources from the largest Internet telescope (the CAIDA darknet telescope) with those from a commercial outpost (the GreyNoise honeyfarm). Neither location actively emits Internet traffic, and each provides a distinct observation of unsolicited Internet traffic (primarily botnets and scanners). Newly developed GraphBLAS hyperspace matrices and D4M associative array technologies enable the efficient analysis of these data on significant scales. The CAIDA sources are well approximated by a Zipf-Mandelbrot distribution. Over a 6-month period, 70% of the brightest (highest frequency) sources in the CAIDA telescope are consistently detected by coeval observations in the GreyNoise honeyfarm. This overlap drops as the sources dim (lower frequency) and as the time difference between the observations grows. The probability of seeing a CAIDA source is proportional to the logarithm of the brightness. The temporal correlations are well described by a modified Cauchy distribution. These observations are consistent with a correlated high-frequency beam of sources that drifts on a time scale of a month.
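
The two distribution families named in the abstract can be sketched numerically. The abstract does not state the functional forms or any fitted values, so the following is only an illustration under stated assumptions: the Zipf-Mandelbrot form p(d) proportional to 1/(d + delta)^alpha is the standard textbook definition, while the peak-normalized form beta/(beta + |t - t0|^alpha) used for the "modified Cauchy" curve is an assumed parameterization. The function names and all parameter values are hypothetical.

```python
def zipf_mandelbrot(d_max, alpha, delta):
    """Normalized Zipf-Mandelbrot probabilities for d = 1..d_max.

    Standard form: p(d) proportional to 1 / (d + delta) ** alpha.
    """
    weights = [1.0 / (d + delta) ** alpha for d in range(1, d_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def modified_cauchy(t, t0, alpha, beta):
    """Assumed modified Cauchy shape: beta / (beta + |t - t0| ** alpha).

    Equals 1 at t == t0 and decays as the time difference grows.
    """
    return beta / (beta + abs(t - t0) ** alpha)


# Illustrative parameter values only; these are not fitted to any data.
p = zipf_mandelbrot(d_max=1000, alpha=1.5, delta=2.0)
corr = [modified_cauchy(t, t0=0.0, alpha=1.0, beta=1.0) for t in range(-3, 4)]
```

Here `p[0]` is the probability mass assigned to d = 1, and `corr` is symmetric about its peak at `t0`, which is the qualitative behavior the abstract describes for overlap that falls off as the time difference between observations grows.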

