FaaSter Troubleshooting – Evaluating Distributed Tracing Approaches for Serverless Applications

10/07/2021
by Maria C. Borges, et al.

Serverless applications can be particularly difficult to troubleshoot, as these applications are often composed of various managed and partly managed services. Faults are often unpredictable and can occur at multiple points, even in simple compositions. Each additional function or service in a serverless composition introduces a new possible fault source and a new layer to obfuscate faults. Currently, serverless platforms offer only limited support for identifying runtime faults. Developers looking to observe their serverless compositions often have to rely on scattered logs and ambiguous error messages to pinpoint root causes. In this paper, we investigate the use of distributed tracing for improving the observability of faults in serverless applications. To this end, we first introduce a model for characterizing fault observability, then provide prototypical implementations of two tracing approaches: a developer-driven one and a platform-supported one. We compare both approaches using our model, measure the associated trade-offs (execution latency, resource utilization), and contribute new insights for troubleshooting serverless compositions.

research
03/28/2019

Co-evolving Tracing and Fault Injection with Box of Pain

Distributed systems are hard to reason about largely because of uncertai...
research
10/20/2017

Hardened Paxos Through Consistency Validation

Due to the emergent adoption of distributed systems when building applic...
research
02/18/2020

Decentralized Validation for Non-malicious Arbitrary Fault Tolerance in Paxos

Fault-tolerant distributed systems offer high reliability because even i...
research
06/04/2023

Learning Test-Mutant Relationship for Accurate Fault Localisation

Context: Automated fault localisation aims to assist developers in the t...
research
05/28/2021

SPFA: SFA on Multiple Persistent Faults

For classical fault analysis, a transient fault is required to be inject...
research
11/06/2019

A Language-based Serverless Function Accelerator

Serverless computing is an approach to cloud computing that allows progr...
research
07/17/2018

User Manual for the Apple CoreCapture Framework

CoreCapture is Apple's primary logging and tracing framework for IEEE 80...
