Please use this identifier to cite or link to this item: http://hdl.handle.net/2005/1285

Title: Hard Drive Failure Prediction : A Rule Based Approach
Authors: Agrawal, Vipul
Advisors: Bhattacharyya, Chiranjib
Keywords: Hard Drive Failure Prediction
Hard Drive Verification
Rule-based Classifiers
Hard Disks (Computer Science)
Rule Discovery Methodology
Disk Events (Computer Science)
Rule-based Learning
Hard Drive Failures
Self Monitoring Analysis and Reporting Technology (SMART)
Submitted Date: Jul-2010
Series/Report no.: G23699
Abstract: The ability to accurately predict an impending hard disk failure is important for reliable storage system design. The facility provided by most hard drive manufacturers, called S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology), has been shown by current research to have poor predictive value. Finding alternatives to S.M.A.R.T. for predicting disk failure is an area of active research. In this work, we present a rule discovery methodology and show that it is possible to construct decision support systems that can detect such failures using information recorded from live disks. Any such prediction methodology should be both highly accurate and easy to interpret. Black box models can deliver highly accurate solutions but do not provide an understanding of the events that explain their decisions. To this end, we explore rule-based classifiers for predicting hard disk failures from various disk events. We show that it is possible to learn easy-to-understand rules from disk events. Our evaluation shows that our system can be tuned either to have a high failure detection rate (i.e., classify a bad disk as bad) or to have a low false alarm rate (i.e., not classify a good disk as bad). We also propose a modification of the MLRules algorithm for classification of data with imbalanced class distributions. The existing algorithm, which assumes relatively balanced class distributions and equal misclassification costs, performs poorly on such datasets. Its performance can be considerably improved by introducing cost-sensitive learning into the existing framework.
Abstract file URL: http://etd.ncsi.iisc.ernet.in/abstracts/1667/G23699-Abs.pdf
URI: http://etd.iisc.ernet.in/handle/2005/1285
Appears in Collections: Computer Science and Automation (csa)

Files in This Item:

File          Size         Format
G23699.pdf    935.54 kB    Adobe PDF

Items in etd@IISc are protected by copyright, with all rights reserved, unless otherwise indicated.

