Title
Model predikcije otkaza magnetnih diskova zasnovan na detekciji anomalija
Creator
Đurašević, Slađana, 1985-
CONOR:
16102247
Copyright date
2025
Select license
Attribution-ShareAlike 3.0 Serbia (CC BY-SA 3.0)
License description
You permit reproduction, distribution, and public communication of the work, and of adaptations, provided that the author's name is credited in the manner specified by the author or licensor and that any adaptation is distributed under the same or a similar license. This license permits commercial use of the work and its adaptations. It is similar to software licenses, that is, open-source licenses. Basic description of the license: http://creativecommons.org/licenses/by-sa/3.0/rs/deed.sr_LATN Full text of the agreement: http://creativecommons.org/licenses/by-sa/3.0/rs/legalcode.sr-Latn
Language
Serbian
Cobiss-ID
Inventory ID
3828
Theses Type
Doctoral dissertation
description
Defense date: 01.09.2025.
Other responsibilities
Academic Expertise
Technical and technological sciences
University
Univerzitet u Kragujevcu
Faculty
Fakultet tehničkih nauka
Alternative title
Prediction model for hard disk drive failures based on anomaly detection
Publisher
[S. M. Đurašević Pešović]
Format
115 leaves
Abstract (sr)
Čovečanstvo svojim aktivnostima generiše ogromne količine digitalnih podataka
čija se veličina iz godine u godinu ubrzano povećava. Većina ovih podataka danas se čuva u
specijalizovanim objektima centara podataka u kojima se nalaze računarski sistemi
koji se koriste za deljenje aplikacija i podataka. U njima se za skladištenje podataka
prvenstveno koriste magnetni diskovi na kojima je uskladišteno oko 90% ukupne
količine podataka, dok se preostalih 10% podataka čuva na poluprovodničkim
diskovima. Glavna prednost magnetnih diskova je njihov veliki kapacitet i niska cena
uskladištenih podataka, što ih čini idealnim za opšte skladištenje podataka i
rezervne kopije podataka. Elektromehanički dizajn magnetnih diskova čini ih
podložnijim otkazima u odnosu na ostale komponente računarskog sistema. Kao
posledica otkaza magnetnog diska najčešće dolazi do gubitka korisničkih podataka,
čija ekonomska vrednost značajno nadmašuje cenu samog diska.
Upotrebom SMART (eng. Self-Monitoring, Analysis and Reporting Technology)
tehnologije, računarski sistem je u stanju da upozori korisnika ako neki od radnih
parametara diska odstupi od unapred definisane vrednosti praga. Metode mašinskog
učenja koriste prednost zavisnosti između više SMART atributa kako bi se
poboljšala stopa predviđanja otkaza diskova. U ovoj doktorskoj disertaciji
predstavljen je model predikcije otkaza diskova koji je zasnovan na metodi detekcije
anomalija. Rad diska može biti predstavljen u višedimenzionalnom prostoru kao niz
tačaka čija je pozicija definisana vrednostima njegovih SMART atributa. Tačke
podataka koje opisuju regularni rad diska će imati tendenciju da se grupišu oko
određene tačke definisane kao srednja vrednost ili centar mase, dok će tačke podataka
sa neispravnih diskova obično biti rasute dalje oko ovog centra mase. Upotrebom
Mahalanobisovog rastojanja, moguće je izmeriti udaljenost, izraženu u standardnim
odstupanjima, tačke podataka u višedimenzionalnom prostoru u odnosu na centar mase,
čime se uklanja uticaj skaliranja i korelacije između atributa. Korišćenjem podesive
granice odlučivanja obučen je model za detekciju anomalija tako da predvidi otkaze sa
najvećom mogućom stopom detekcije, pri čemu minimizira broj lažnih detekcija.
Realizovani model testiran je na CMRR skupu podataka tako što je metodom
rekurzivne eliminacije atributa kreiran optimalni skup od sedam najznačajnijih
atributa. Na pomenutom skupu podataka na deset nasumičnih testova ostvarena je
prosečna stopa detekcije otkaza od 96.11%, pri čemu nije bilo pogrešnih detekcija
otkaza. Predloženi model je predvideo više od 80% kvarova, 24 sata pre njihovog
stvarnog nastanka, što omogućava blagovremenu izradu rezervne kopije podataka što
opravdava njegovu primenu u praksi. Model za detekciju anomalija testiran je i na
poluprovodničkim diskovima na kojima je takođe postigao visok nivo prediktivnih
performansi.
Abstract (en)
Humanity generates enormous amounts of digital data through its activities, the size of
which is increasing rapidly from year to year. Most of this data is now stored in specialized data
center facilities that house computer systems used to share applications and data. Magnetic
disks are the primary storage medium, holding about 90% of the total amount of stored data,
while the remaining 10% of the data is stored on semiconductor disks. The main advantage of
magnetic disks is their large capacity and low cost of stored data, which makes them ideal for
general data storage and data backup. The electromechanical design of magnetic disks makes
them more susceptible to failure than other components of a computer system. The failure of a
magnetic disk most often leads to the permanent loss of user data, the economic value of which
significantly exceeds the cost of the magnetic disk itself.
Using SMART (Self-Monitoring, Analysis and Reporting Technology) technology, a
computer system can alert the user if any of the disk's operating parameters deviate from a
predefined threshold value. Machine learning methods take advantage of the dependencies
between multiple SMART attributes to improve the disk failure prediction rate. This doctoral
dissertation presents a disk failure prediction model based on the anomaly detection method.
The operation of a disk can be represented in a multidimensional space as a series of points
whose positions are defined by the values of its SMART attributes. Data points describing
regular disk operation will tend to cluster around a certain point defined as the mean value or
center of mass, while data points from failed disks will usually be scattered farther from this
center of mass. Using the Mahalanobis distance, it is possible to measure the distance of data
points in a multidimensional space from the center of mass, expressed in standard deviations,
thereby removing the influence of scaling and correlation between
attributes. Using an adjustable decision boundary, an anomaly detection model was trained to
predict failures with the highest possible detection rate, while minimizing the number of false
detections.
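The distance-based detection described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the synthetic three-attribute data, and the threshold value of 4.0 are all assumptions chosen for illustration, and the real model uses SMART attributes from the CMRR dataset with a tuned decision boundary.

```python
import numpy as np


def fit_healthy_profile(healthy):
    """Estimate the center of mass and inverse covariance from healthy-disk samples."""
    mean = healthy.mean(axis=0)
    cov = np.cov(healthy, rowvar=False)
    # The pseudo-inverse tolerates a near-singular covariance matrix.
    inv_cov = np.linalg.pinv(cov)
    return mean, inv_cov


def mahalanobis(samples, mean, inv_cov):
    """Mahalanobis distance of each row in `samples` from `mean`."""
    diff = samples - mean
    # Row-wise quadratic form: (x - mean)^T * inv_cov * (x - mean)
    squared = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    return np.sqrt(np.maximum(squared, 0.0))


def is_anomalous(samples, mean, inv_cov, threshold):
    """Flag samples whose distance exceeds the adjustable decision boundary."""
    return mahalanobis(samples, mean, inv_cov) > threshold


# Synthetic stand-in for attribute vectors of healthy disks (not CMRR data).
rng = np.random.default_rng(0)
healthy = rng.normal(loc=[100.0, 50.0, 10.0], scale=[5.0, 2.0, 1.0], size=(500, 3))
mean, inv_cov = fit_healthy_profile(healthy)

# One sample at the nominal operating point, one far away from it.
suspect = np.array([[100.0, 50.0, 10.0], [140.0, 30.0, 20.0]])
flags = is_anomalous(suspect, mean, inv_cov, threshold=4.0)  # threshold is hypothetical
```

Raising the threshold reduces false detections at the cost of missed failures, and lowering it does the opposite, which is the trade-off the adjustable decision boundary controls.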
The implemented model was tested on the CMRR dataset by creating an optimal set of the
seven most significant attributes using the recursive attribute elimination method. On the
aforementioned dataset, an average failure detection rate of 96.11% was achieved in ten random
tests, with no false detections of failures. The proposed model predicted more than 80% of
failures 24 hours before their actual occurrence, which allows for timely data backup and
justifies its application in practice. The anomaly detection model was also tested on
semiconductor disks, where it also achieved a high level of predictive performance.
Authors Key words
magnetni diskovi, SMART atributi, detekcija anomalija,
predikcija otkaza, Mahalanobisovo rastojanje
Authors Key words
magnetic disks, SMART attributes, anomaly detection, failure prediction,
Mahalanobis distance
Classification
004.33(043.3)
Subject
Magnetic disks
Type
Text