Aggregation in the Mirror Space (AIMS): Fast, Accurate Distributed Machine Learning in Military Settings

by Ryan Yang, et al.

Distributed machine learning (DML) can be an important capability for a modern military, allowing it to exploit data and devices distributed across multiple vantage points in order to adapt and learn. Existing DML frameworks, however, cannot realize the full benefits of DML because they are all built on simple linear aggregation, and linear aggregation cannot handle the divergence challenges that arise in military settings: the training data at different devices can be heterogeneous (i.e., non-IID), leading to model divergence, while the devices' ability to communicate is substantially limited (i.e., weak connectivity due to sparse and dynamic links), reducing their ability to reconcile that divergence. In this paper, we introduce a novel DML framework called aggregation in the mirror space (AIMS), which allows a DML system to introduce a general mirror function that maps each model into a mirror space, where both aggregation and gradient descent are conducted. By adapting the convexity of the mirror function to the strength of the divergence forces, AIMS enables automatic optimization of DML. We conduct both rigorous analysis and extensive experimental evaluations to demonstrate the benefits of AIMS. For example, we prove that AIMS achieves a loss of O((m^{r+1}/T)^{1/r}) after T network-wide updates, where m is the number of devices and r is the convexity of the mirror function; existing linear aggregation frameworks are the special case r = 2. Our experimental evaluations using EMANE (Extendable Mobile Ad-hoc Network Emulator) for military communication settings show similar results: AIMS improves the DML convergence rate by up to 57% and scales well to larger numbers of devices with weak connectivity, all with little additional computation overhead compared with traditional linear aggregation.
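To make the idea of aggregating in a mirror space concrete, the following is a minimal sketch, not the paper's exact construction: it uses an illustrative p-norm potential ψ(x) = (1/r) Σ|x_i|^r, whose gradient serves as the mirror map. Each model is mapped into the mirror space, the mapped models are averaged, and the result is mapped back. With r = 2 the mirror map is the identity, so the procedure reduces to ordinary linear averaging, matching the special case noted above. The function names (`mirror_map`, `aims_aggregate`) are assumptions for illustration.

```python
import numpy as np

def mirror_map(x, r):
    # Gradient of the illustrative potential psi(x) = (1/r) * sum(|x_i|^r).
    # For r = 2 this is the identity map.
    return np.sign(x) * np.abs(x) ** (r - 1)

def inverse_mirror_map(y, r):
    # Inverse of the map above: recovers x from y = sign(x)|x|^(r-1).
    return np.sign(y) * np.abs(y) ** (1.0 / (r - 1))

def aims_aggregate(models, r=2):
    # Map each device's model into the mirror space, average there,
    # then map the average back to the model space.
    mirrored = [mirror_map(m, r) for m in models]
    avg = np.mean(mirrored, axis=0)
    return inverse_mirror_map(avg, r)

# With r = 2 this is plain linear averaging; with r > 2 the mirror map
# compresses large coordinates, damping the pull of divergent models.
```

For example, aggregating the models [1.0, 3.0] and [3.0, 1.0] with r = 2 yields [2.0, 2.0], exactly as linear averaging would; choosing a larger r changes how the aggregate weighs outlying coordinates.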
