Workload Characterization of Autonomic DBMSs using Statistical and Data mining techniques

dc.contributor.advisor: Denko, Meiso (PhD)
dc.contributor.author: Zewdu, Zerihun
dc.date.accessioned: 2018-06-26T13:49:56Z
dc.date.accessioned: 2023-11-29T04:05:54Z
dc.date.available: 2018-06-26T13:49:56Z
dc.date.available: 2023-11-29T04:05:54Z
dc.date.issued: 2008-09
dc.description.abstract: Autonomic configuration is one of the most important components of an autonomic system. Database Management Systems (DBMSs) are one of the areas where autonomic configuration is most needed. For a DBMS to configure itself in response to changing external workloads, it must be able to detect the workloads and classify them into their dominant categories, mainly DSS (Decision Support Systems) and OLTP (Online Transaction Processing). Previous research in this area has proposed a methodology for workload classification, but the tests were performed using a limited set of algorithms and on only one commercial DBMS. In this thesis, a model by which an autonomic DBMS can identify and characterize the type of workload acting upon it is developed, and the database status variables most affected by changing workloads are identified. This is important for a self-configuring autonomic DBMS because it needs to reconfigure itself based on the workload changes it identifies. Two algorithms are selected for database workload classification, hierarchical clustering and classification and regression tree (CART), and are applied after running database workloads drawn from TPC benchmark queries and transactions. The costs of these workloads are measured in terms of the status variables of the selected DBMS (MySQL). These costs are used to determine whether a workload is DSS or OLTP using the selected classification algorithms. After a set of extensive experiments and analyses, we found that not all the DBMS status variables are equally important in classifying the collected workloads. In fact, some of the variables have no significant relevance apart from increasing the classification complexity. We have identified these variables and listed them in this thesis.
Even though both of the selected classification algorithms are good at classifying the collected workloads, the hierarchical clustering algorithm has the additional advantage of showing the degree of correlation among clusters. This can be important in the area of database workload shift detection. [en_US]
dc.identifier.uri: http://etd.aau.edu.et/handle/123456789/3776
dc.language.iso: en [en_US]
dc.publisher: Addis Ababa University [en_US]
dc.subject: Workload Characterization [en_US]
dc.title: Workload Characterization of Autonomic DBMSs using Statistical and Data mining techniques [en_US]
dc.type: Thesis [en_US]

Files

Original bundle
Name: Zerihun Zewdu.pdf
Size: 310.73 KB
Format: Adobe Portable Document Format