Analysis of software aging in a Web server

Title: Analysis of software aging in a Web server
Publication Type: Journal Article
Year of Publication: 2006
Authors: M Grottke, L Li, K Vaidyanathan, and KS Trivedi
Journal: IEEE Transactions on Reliability
Volume: 55
Issue: 3
Start Page: 411
Pagination: 411 - 420
Date Published: 09/2006
Abstract

Several recent studies have reported & examined the phenomenon that long-running software systems show an increasing failure rate and/or a progressive degradation of their performance. Causes of this phenomenon, which has been referred to as "software aging", are the accumulation of internal error conditions, and the depletion of operating system resources. A proactive technique called "software rejuvenation" has been proposed as a way to counteract software aging. It involves occasionally terminating the software application, cleaning its internal state and/or its environment, and then restarting it. Due to the costs incurred by software rejuvenation, an important question is when to schedule this action. While periodic rejuvenation at constant time intervals is straightforward to implement, it may not yield the best results. The rate at which software ages is usually not constant, but it depends on the time-varying system workload. Software rejuvenation should therefore be planned & initiated in the face of the actual system behavior. This requires the measurement, analysis, and prediction of system resource usage. In this paper, we study the development of resource usage in a web server while subjecting it to an artificial workload. We first collect data on several system resource usage & activity parameters. Non-parametric statistical methods are then applied toward detecting & estimating trends in the data sets. Finally, we fit time series models to the data collected. Unlike the models used previously in the research on software aging, these time series models allow for seasonal patterns, and we show how the exploitation of the seasonal variation can help in adequately predicting the future resource usage. Based on the models employed here, proactive management techniques like software rejuvenation triggered by actual measurements can be built. © 2006 IEEE.

DOI: 10.1109/TR.2006.879609
Short Title: IEEE Transactions on Reliability