Dynamic Parameter Allocation in Parameter Servers

02/03/2020
by Alexander Renz-Wieland, et al.

To keep up with increasing dataset sizes and model complexity, distributed training has become a necessity for large machine learning tasks. Parameter servers ease the implementation of distributed parameter management, a key concern in distributed training, but can induce severe communication overhead. To reduce this overhead, distributed machine learning algorithms employ techniques that increase parameter access locality (PAL), achieving up to linear speed-ups. We found, however, that existing parameter servers provide only limited support for PAL techniques and therefore prevent efficient training. In this paper, we explore whether and to what extent PAL techniques can be supported, and whether such support is beneficial. We propose to integrate dynamic parameter allocation into parameter servers, describe an efficient implementation of such a parameter server called Lapse, and experimentally compare its performance to existing parameter servers across a number of machine learning tasks. We found that Lapse provides near-linear scaling and can be orders of magnitude faster than existing parameter servers.
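For illustration, the following is a minimal, single-process Python sketch of the idea behind dynamic parameter allocation: a directory tracks which node currently owns each parameter, pull and push requests are routed to that owner, and a relocation primitive moves a parameter to the node that is about to access it so that subsequent accesses become local. The class and method names are illustrative assumptions rather than the actual Lapse API, and the real system is a distributed implementation, not a toy in-process one.

```python
# Toy, single-process sketch of a parameter server with dynamic parameter
# allocation (DPA). The pull/push/localize interface and all names here are
# illustrative simplifications, not the actual Lapse implementation.

class Node:
    """One parameter-server node holding a shard of the parameters."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.store = {}  # key -> parameter value held locally


class DynamicParameterServer:
    """Routes pull/push to the current owner and lets nodes relocate keys."""
    def __init__(self, num_nodes, keys):
        self.nodes = [Node(i) for i in range(num_nodes)]
        # Directory: key -> id of the node that currently owns the parameter.
        # With DPA this mapping changes at runtime; a static PS would fix it.
        self.owner = {k: k % num_nodes for k in keys}
        for k, o in self.owner.items():
            self.nodes[o].store[k] = 0.0

    def pull(self, node_id, key):
        """Read a parameter; returns (value, whether the access was remote)."""
        o = self.owner[key]
        return self.nodes[o].store[key], o != node_id

    def push(self, node_id, key, delta):
        """Apply an update at the current owner; returns True if remote."""
        o = self.owner[key]
        self.nodes[o].store[key] += delta
        return o != node_id

    def localize(self, node_id, key):
        """Relocate `key` to `node_id` so that subsequent accesses are local."""
        o = self.owner[key]
        if o == node_id:
            return
        value = self.nodes[o].store.pop(key)
        self.nodes[node_id].store[key] = value
        self.owner[key] = node_id


if __name__ == "__main__":
    ps = DynamicParameterServer(num_nodes=4, keys=range(8))
    # A worker on node 2 is about to work intensively on key 5: relocating
    # the parameter first turns the following accesses into local ones.
    ps.localize(2, 5)
    _, remote = ps.pull(2, 5)
    print("pull of key 5 from node 2 was remote:", remote)  # False after localize
```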
