Streaming Big Data Analytic Platform for Unified Log Management and Monitoring
dc.contributor.advisor | Teferi, Dereje (PhD) | |
dc.contributor.author | Sholaye, Muluken | |
dc.date.accessioned | 2022-04-07T12:44:30Z | |
dc.date.accessioned | 2023-11-18T12:48:38Z | |
dc.date.available | 2022-04-07T12:44:30Z | |
dc.date.available | 2023-11-18T12:48:38Z | |
dc.date.issued | 2021-09-10 | |
dc.description.abstract | Over the past 20 years, data volumes have grown at a large scale across many fields. Managing this data and gaining insight from it is often a challenge and potentially a key to competitive advantage, which forces industries to find ways to collect and integrate massive data from widely distributed sources. Organizations today are overwhelmed by external data sources on top of their internal ones. The challenges they face are multifaceted: the need for large storage capacity (volume), the speed at which data is created (velocity), and the diversity of data types (variety). An organization must know the overall posture of its information systems and IT infrastructure at any given time, and one way to do so is through system logs. Although logs are an abundant and very rich source of information, they are mostly overlooked because of the complexity and volume of their contents. The main objective of this research is to design and demonstrate a generic, service-oriented big data architectural framework for efficient, fault-tolerant log management and near-real-time monitoring of selected standard log types: syslog, metric logs, NetFlow, and audit logs. The framework uses an integrated, layered stack of Beats data shippers, Apache Kafka, Apache Spark, the Elasticsearch indexer, and the Kibana visualization toolkit, and follows the Design Science Research (DSR) methodology. A Mirantis OpenStack-based cloud system of four interconnected nodes was prepared to deploy the proposed log management system. The contents of the above-mentioned log files are redirected to the streaming log analytics engine in real time. Performance, scalability, and fault tolerance are the main evaluation metrics, along with assessment against results obtained by similar research works.
The results show that the proposed layered streaming technology stack visualizes incoming log data in less than 1 second on average. The data replication and buffering provided by Apache Kafka add fault tolerance to the system, and scaling the system is as quick as installing the client services of the technologies used in the prototype. | en_US |
dc.identifier.uri | http://etd.aau.edu.et/handle/12345678/31214 | |
dc.language.iso | en | en_US |
dc.publisher | Addis Ababa University | en_US |
dc.subject | Streaming Big Data | en_US |
dc.subject | Log Management System | en_US |
dc.subject | Big Data Analytics | en_US |
dc.subject | Real Time Monitoring | en_US |
dc.subject | Log Stream Processing | en_US |
dc.title | Streaming Big Data Analytic Platform for Unified Log Management and Monitoring | en_US |
dc.type | Thesis | en_US |