Arvind Krishnamurthy

research

∙ 08/14/2023

Symphony: Optimized Model Serving using Centralized Orchestration

The orchestration of deep neural network (DNN) model inference on GPU cl...

0 Lequn Chen, et al. ∙

research

∙ 05/29/2023

Bandwidth Optimal Pipeline Schedule for Collective Communication

We present a strongly polynomial-time algorithm to generate bandwidth op...

0 Liangyu Zhao, et al. ∙

research

∙ 05/18/2023

TSoR: TCP Socket over RDMA Container Network for Cloud Native Computing

Cloud-native containerized applications constantly seek high-performance...

0 Yulin Sun, et al. ∙

research

∙ 07/02/2022

Dissecting Service Mesh Overheads

Service meshes play a central role in the modern application ecosystem b...

0 Xiangfeng Zhu, et al. ∙

research

∙ 02/07/2022

Optimal Direct-Connect Topologies for Collective Communications

We consider the problem of distilling optimal network topologies for col...

0 Liangyu Zhao, et al. ∙

research

∙ 09/16/2021

Disaggregating and Consolidating Network Functionalities with SuperNIC

Resource disaggregation has gained huge popularity in recent years. Exis...

0 Yizhou Shan, et al. ∙

research

∙ 05/28/2021

Cloud Collectives: Towards Cloud-aware Collectives for ML Workloads with Rank Reordering

ML workloads are becoming increasingly popular in the cloud. Good cloud ...

16 Liang Luo, et al. ∙

research

∙ 05/22/2021

AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly

The learning rate (LR) schedule is one of the most important hyper-param...

0 Yuchen Jin, et al. ∙

research

∙ 11/29/2020

Srift: Swift and Thrift Cloud-Based Distributed Training

Cost-efficiency and training time are primary concerns in cloud-based di...

0 Liang Luo, et al. ∙

research

∙ 08/14/2020

Making Distributed Mobile Applications SAFE: Enforcing User Privacy Policies on Untrusted Applications with Secure Application Flow Enforcement

Today's mobile devices sense, collect, and store huge amounts of persona...

0 Adriana Szekeres, et al. ∙

research

∙ 01/22/2020

Talek: Private Group Messaging with Hidden Access Patterns

Talek is a private group messaging system that sends messages through po...

0 Raymond Cheng, et al. ∙

research

∙ 04/24/2019

DeepSense: Enabling Carrier Sense in Low-Power Wide Area Networks Using Deep Learning

The last few years have seen the proliferation of low-power wide area ne...

0 Justin Chan, et al. ∙

research

∙ 02/22/2019

Scaling Distributed Machine Learning with In-Network Aggregation

Training complex machine learning models in parallel is an increasingly ...

0 Amedeo Sapio, et al. ∙

research

∙ 12/05/2018

ADARES: Adaptive Resource Management for Virtual Machines

Virtual execution environments allow for consolidation of multiple appli...

0 Ignacio Cano, et al. ∙

research

∙ 07/11/2018

VTA: An Open Hardware-Software Stack for Deep Learning

Hardware acceleration is an enabler for ubiquitous and efficient deep le...

6 Thierry Moreau, et al. ∙

research

∙ 06/21/2018

Revisiting Network Support for RDMA

The advent of RoCE (RDMA over Converged Ethernet) has led to a significa...

0 Radhika Mittal, et al. ∙

research

∙ 05/21/2018

Learning to Optimize Tensor Programs

We introduce a learning-based framework to optimize tensor programs for ...

0 Tianqi Chen, et al. ∙

research

∙ 05/21/2018

Parameter Hub: a Rack-Scale Parameter Server for Distributed Deep Neural Network Training

Distributed deep neural network (DDNN) training constitutes an increasin...

0 Liang Luo, et al. ∙

research

∙ 04/18/2018

Volur: Concurrent Edge/Core Route Control in Data Center Networks

A perennial question in computer networks is where to place functionalit...

0 Qiao Zhang, et al. ∙

research

∙ 02/12/2018

TVM: An Automated End-to-End Optimizing Compiler for Deep Learning

There is an increasing need to bring machine learning to a wide diversit...

0 Tianqi Chen, et al. ∙

research

∙ 02/12/2018

TVM: End-to-End Optimization Stack for Deep Learning

Scalable frameworks, such as TensorFlow, MXNet, Caffe, and PyTorch drive...

0 Tianqi Chen, et al. ∙

research

∙ 01/30/2018

Parameter Hub: High Performance Parameter Servers for Efficient Distributed Deep Neural Network Training

Most work in the deep learning systems community has focused on faster i...

0 Liang Luo, et al. ∙

research

∙ 11/20/2016

Fast Video Classification via Adaptive Cascading of Deep Models

Recent advances have enabled "oracle" classifiers that can classify acro...

0 Haichen Shen, et al. ∙
