OS-level Failure Injection with SystemTap

02/05/2015
by Camille Coti, et al.

Failure injection has long been an important tool for experimenting with robust, resilient distributed systems. To reproduce real-life conditions, parts of the application must be killed without letting the operating system close the existing network communications in a "clean" way. When a process is simply killed, the OS closes them. SystemTap is an infrastructure that probes the Linux kernel's internal calls. If processes are killed at kernel level, they can be destroyed without letting the OS do anything else. In this paper, we present a kernel-level failure injection system based on SystemTap and show how it can be used to implement deterministic and probabilistic failure scenarios.
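
The distinction between deterministic and probabilistic failure scenarios can be illustrated with a small sketch. This is not the paper's implementation and does not use SystemTap. It only models the decision of when a failure would be injected, assuming a deterministic scenario fires on a fixed event count and a probabilistic one fires on each event with a fixed probability. The class names, the `should_fail` method, and the `first_failure` helper are all hypothetical.

```python
import random
from typing import Optional


class DeterministicScenario:
    """Hypothetical policy: fail on exactly the n-th observed event."""

    def __init__(self, target_event: int) -> None:
        self.target_event = target_event
        self.seen = 0

    def should_fail(self) -> bool:
        self.seen += 1
        return self.seen == self.target_event


class ProbabilisticScenario:
    """Hypothetical policy: fail on each observed event with a fixed probability."""

    def __init__(self, probability: float, seed: Optional[int] = None) -> None:
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be in [0, 1]")
        self.probability = probability
        # A private, seedable generator keeps runs reproducible.
        self.rng = random.Random(seed)

    def should_fail(self) -> bool:
        return self.rng.random() < self.probability


def first_failure(scenario, max_events: int) -> Optional[int]:
    """Return the 1-based index of the first event that triggers a failure, or None."""
    for event in range(1, max_events + 1):
        if scenario.should_fail():
            return event
    return None


if __name__ == "__main__":
    print("deterministic:", first_failure(DeterministicScenario(3), 10))
    print("probabilistic:", first_failure(ProbabilisticScenario(0.2, seed=1), 10))
```

In a real injector, the event would be something observed by a kernel probe and the action would be destroying the target process. Here both are reduced to a counter so that the two policies can be compared in isolation.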


research
07/01/2019

Understanding Fault Scenarios and Impacts through Fault Injection Experiments in Cielo

We present a set of fault injection experiments performed on the ACES (L...
research
10/17/2022

Fault Injection based Failure Analysis of CentOS, Anolis OS and OpenEuler

The reliability of operating system (OS) has always been a major concern...
research
01/23/2021

Resilient Virtualized Systems Using ReHype

System-level virtualization introduces critical vulnerabilities to failu...
research
07/31/2022

Predicting Failure times for some Unobserved Events with Application to Real-Life Data

This study aims to predict failure times for some units in some lifetime...
research
09/30/2020

Fault Injection Analytics: A Novel Approach to Discover Failure Modes in Cloud-Computing Systems

Cloud computing systems fail in complex and unexpected ways due to unexp...
research
12/27/2018

TripleAgent: Monitoring, Perturbation And Failure-obliviousness for Automated Resilience Improvement in Java Applications

In this paper, we present a novel system for fault injection in producti...
research
05/30/2023

FERN: Leveraging Graph Attention Networks for Failure Evaluation and Robust Network Design

Robust network design, which aims to guarantee network availability unde...
