Minimizing completion time of a program by checkpointing and rejuvenation

Title: Minimizing completion time of a program by checkpointing and rejuvenation
Publication Type: Journal Article
Year of Publication: 1996
Authors: S Garg, C Kintala, Y Huang, and KS Trivedi
Journal: Performance Evaluation Review
Volume: 24
Issue: 1
Start Page: 252
Pagination: 252-261
Date Published: 01/1996
Abstract

Checkpointing with rollback-recovery is a well-known technique for reducing the completion time of a program in the presence of failures. While checkpointing is corrective in nature, rejuvenation refers to preventive maintenance of software aimed at reducing unexpected failures, which mostly result from the aging phenomenon. In this paper, we show how both these techniques may be used together to further reduce the expected completion time of a program. The idea of using checkpoints to reduce the amount of rollback upon a failure is taken a step further by combining it with rejuvenation. We derive the equations for the expected completion time of a program with finite failure-free running time for the following three cases: (a) neither checkpointing nor rejuvenation is employed, (b) only checkpointing is employed, and (c) both checkpointing and rejuvenation are employed. We also present numerical results for a Weibull failure time distribution for the above three cases and discuss optimal checkpointing and rejuvenation that minimize the expected completion time. Using the numerical results, we draw some conclusions about the benefits of these techniques in relation to the nature of the failure distribution.
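
To make the three cases in the abstract concrete, the following is a minimal Monte Carlo sketch in Python. It is not the paper's analytical model, and every modeling choice here is an assumption made for illustration: checkpoints are equally spaced, a failure rolls the program back to the last checkpoint and restarts it "as good as new", rejuvenation (when enabled) happens right after each checkpoint and also resets the software's age, and no failure occurs during rejuvenation or recovery. The function names (`simulate_completion`, `mean_completion`) and all parameter names and values are hypothetical.

```python
import random


def simulate_completion(work, n_segments, shape, scale,
                        checkpoint_cost=0.0, rejuvenation_cost=0.0,
                        recovery_cost=0.0, rejuvenate=False, rng=None):
    """Return one simulated completion time for `work` units of failure-free work.

    n_segments=1 means no checkpointing: every failure restarts from scratch.
    """
    if rng is None:
        rng = random.Random()
    segment = work / n_segments
    elapsed = 0.0   # wall-clock time spent so far
    done = 0        # segments already saved by a checkpoint (or finished)
    age = 0.0       # time since the last restart or rejuvenation
    # random.weibullvariate takes (scale, shape), in that order.
    ttf = rng.weibullvariate(scale, shape)  # time to failure, measured from age 0
    while done < n_segments:
        last = done == n_segments - 1
        # No checkpoint is taken after the final segment.
        need = segment + (0.0 if last else checkpoint_cost)
        if age + need <= ttf:
            # Segment (and its checkpoint) completes before the next failure.
            elapsed += need
            age += need
            done += 1
            if rejuvenate and not last:
                # Assumed: rejuvenation resets the age, so a fresh lifetime is drawn.
                elapsed += rejuvenation_cost
                age = 0.0
                ttf = rng.weibullvariate(scale, shape)
        else:
            # Failure mid-segment: lose the partial work, pay the recovery cost,
            # restart from the last checkpoint with the age reset.
            elapsed += (ttf - age) + recovery_cost
            age = 0.0
            ttf = rng.weibullvariate(scale, shape)
    return elapsed


def mean_completion(runs, seed=0, **kwargs):
    """Average completion time over `runs` independent samples."""
    rng = random.Random(seed)
    return sum(simulate_completion(rng=rng, **kwargs) for _ in range(runs)) / runs


if __name__ == "__main__":
    # Illustrative values only; shape > 1 models an increasing failure rate (aging).
    common = dict(runs=5000, work=10.0, shape=2.0, scale=8.0, recovery_cost=0.2)
    print("(a) neither:", mean_completion(n_segments=1, **common))
    print("(b) checkpointing only:",
          mean_completion(n_segments=5, checkpoint_cost=0.1, **common))
    print("(c) checkpointing + rejuvenation:",
          mean_completion(n_segments=5, checkpoint_cost=0.1,
                          rejuvenation_cost=0.1, rejuvenate=True, **common))
```

Varying `shape` around 1 and sweeping `n_segments` gives a rough feel for how the benefit of each technique depends on the failure distribution, which is the kind of question the paper addresses analytically.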

DOI: 10.1145/233008.233050
Short Title: Performance Evaluation Review