BigDL 2.0: Seamless Scaling of AI Pipelines from Laptops to Distributed Clusters

by Jason Dai, et al.

Most AI projects start with a Python notebook running on a single laptop; however, one usually needs to go through significant pain to scale it to handle larger datasets (for both experimentation and production deployment). This usually entails many manual and error-prone steps before data scientists can take full advantage of the available hardware resources (e.g., SIMD instructions, multi-processing, quantization, memory allocation optimization, data partitioning, distributed computing, etc.). To address this challenge, we have open-sourced BigDL 2.0 under the Apache 2.0 license (combining the original BigDL and Analytics Zoo projects); using BigDL 2.0, users can simply build conventional Python notebooks on their laptops (with possible AutoML support), which can then be transparently accelerated on a single node (with up to 9.6x speedup in our experiments) and seamlessly scaled out to a large cluster (across several hundred servers in real-world use cases). BigDL 2.0 has already been adopted by many real-world users (such as Mastercard, Burger King, and Inspur) in production.
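The "same notebook, laptop to cluster" workflow described above can be illustrated with a short sketch. This is a hedged, illustrative example modeled on BigDL's Orca layer; the function names and parameters (`init_orca_context`, `Estimator.from_torch`, the `*_creator` callables) are assumptions based on BigDL's documented API style, not verified code from the paper:

```
# Illustrative sketch only: assumes BigDL 2.0 is installed (pip install bigdl)
# and that model_creator / optimizer_creator / loss_creator / data_creator
# are user-defined PyTorch factory functions (hypothetical placeholders here).
from bigdl.orca import init_orca_context, stop_orca_context
from bigdl.orca.learn.pytorch import Estimator

# Laptop prototyping: local mode uses the machine's own cores.
# Per the paper, scaling out is a configuration change
# (e.g., cluster_mode="yarn-client" with num_nodes set), not a code rewrite.
init_orca_context(cluster_mode="local", cores=4, memory="4g")

est = Estimator.from_torch(model=model_creator,
                           optimizer=optimizer_creator,
                           loss=loss_creator)
est.fit(data=data_creator, epochs=2)

stop_orca_context()
```

The design point this sketch tries to capture is the one the abstract claims: the training code stays identical, and only the context initialization changes between single-node acceleration and distributed execution.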
