Linear tSNE optimization for the Web

by Nicola Pezzotti, et al.

The t-distributed Stochastic Neighbor Embedding (tSNE) algorithm has in recent years become one of the most widely used and insightful techniques for the exploratory analysis of high-dimensional data. tSNE reveals clusters of high-dimensional data points at different scales while requiring only minimal tuning of its parameters. Despite these advantages, the computational complexity of the algorithm limits its application to relatively small datasets. To address this problem, several evolutions of tSNE have been developed in recent years, mainly focusing on the scalability of the similarity computations between data points. However, these contributions are insufficient to achieve interactive rates when visualizing the evolution of the tSNE embedding for large datasets. In this work, we present a novel approach to the minimization of the tSNE objective function that relies heavily on modern graphics hardware and has linear computational complexity. Our technique not only outperforms the state of the art, but can also be executed on the client side in a browser. We propose to approximate the repulsive forces between data points using adaptive-resolution textures that are drawn at every iteration with WebGL. This approximation allows us to reformulate the tSNE minimization problem as a series of tensor operations that are computed with TensorFlow.js, a JavaScript library for scalable tensor computations.
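To make the idea of a texture-based approximation more concrete, the following is a minimal sketch in plain Python, not the authors' WebGL or TensorFlow.js implementation. It assumes the standard tSNE Student-t kernel 1/(1 + d^2) and approximates the normalization term of the repulsive forces by evaluating a scalar field on a grid and reading it back at each point with bilinear interpolation. The fixed-resolution grid (the abstract describes adaptive-resolution textures), the field definition, and all function names are assumptions made for illustration only.

```python
import random


def kernel(dx, dy):
    """Student-t kernel used by tSNE in the embedding space."""
    return 1.0 / (1.0 + dx * dx + dy * dy)


def exact_normalization(points):
    """Reference value: sum of the kernel over all ordered pairs, O(N^2)."""
    z = 0.0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j:
                z += kernel(xi - xj, yi - yj)
    return z


def build_field(points, lo, hi, res):
    """Evaluate the summed kernel on a res x res grid (stand-in for a texture)."""
    step = (hi - lo) / (res - 1)
    field = [[0.0] * res for _ in range(res)]
    for r in range(res):
        gy = lo + r * step
        for c in range(res):
            gx = lo + c * step
            field[r][c] = sum(kernel(gx - x, gy - y) for x, y in points)
    return field, step


def sample_field(field, step, lo, x, y):
    """Bilinear lookup of the field at (x, y), like a filtered texture fetch."""
    res = len(field)
    fx = (x - lo) / step
    fy = (y - lo) / step
    c = min(int(fx), res - 2)  # clamp so c + 1 stays inside the grid
    r = min(int(fy), res - 2)
    tx = fx - c
    ty = fy - r
    top = field[r][c] * (1 - tx) + field[r][c + 1] * tx
    bottom = field[r + 1][c] * (1 - tx) + field[r + 1][c + 1] * tx
    return top * (1 - ty) + bottom * ty


def approx_normalization(points, lo=-2.0, hi=2.0, res=32):
    """Approximate the normalization with one field lookup per point.

    Subtracting 1.0 removes each point's own kernel contribution.
    Points are assumed to lie inside [lo, hi] on both axes.
    """
    field, step = build_field(points, lo, hi, res)
    return sum(sample_field(field, step, lo, x, y) - 1.0 for x, y in points)


if __name__ == "__main__":
    rng = random.Random(0)
    pts = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(150)]
    print("exact :", exact_normalization(pts))
    print("approx:", approx_normalization(pts))
```

For a fixed grid resolution, both building the field and sampling it scale linearly with the number of points, whereas the reference computation is quadratic. In this sketch the field is built by summing over all points at every grid cell; a GPU version would presumably draw one kernel per point into the texture instead.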




