Resilience in computer systems and networks

Title: Resilience in computer systems and networks
Publication Type: Journal Article
Year of Publication: 2009
Authors: KS Trivedi, DS Kim, and R Ghosh
Journal: IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD
Start Page: 74
Pagination: 74 - 77
Date Published: 01/2009
Abstract

The term resilience is used differently by different communities. In general engineering systems, fast recovery from a degraded system state is often termed resilience. The computer networking community defines it as the combination of trustworthiness (dependability, security, performability) and tolerance (survivability, disruption tolerance, and traffic tolerance). The dependable computing community defined resilience as the persistence of service delivery that can justifiably be trusted when facing changes. In this paper, resilience definitions of systems and networks will be presented. Metrics for resilience will be compared with dependability metrics such as availability, performance, and performability. Simple examples will be used to show quantification of resilience via probabilistic analytic models. Copyright 2009 ACM.

DOI: 10.1145/1687399.1687415
Short Title: IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD