Web Usage Pattern Discovery and Analysis for Website Optimization: The Case of Ethio Telecom Official Website
Date
2015-05
Publisher
Addis Ababa University
Abstract
In this ever-growing Internet era, websites have become one of the most important media for communicating with stakeholders. Many organizations have realized the need to study the behavior of their website users in order to meet their objectives. Ethio Telecom, as the sole telecom operator and Internet service provider in Ethiopia, should continuously assess and monitor its official website in order to analyze customers' usage behavior and restructure the website accordingly. In this study, an attempt is made to discover useful patterns from the server log files of the Ethio Telecom official website using web usage mining.
The research follows the Web Usage Mining process model suggested by Sharma [21], which consists of data collection, data preprocessing, pattern discovery and pattern analysis. Server log data was used for pattern discovery, and Google Analytics reports were exported for statistical analysis of the website usage.
The access log files exported from the web server cannot be used directly for web usage mining because they may contain a large amount of irrelevant information. Preprocessing of the web usage data is therefore required to eliminate noisy data and make the data suitable for further analysis. The preprocessing tasks applied include data cleaning, session identification, feature selection, transaction identification and transformation of the data into a format that Weka can read. In total, 301,580 transactions were used for the experiment.
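The cleaning and session identification steps can be illustrated with a minimal sketch. It is not the study's actual procedure: the Common Log Format layout, the list of static file extensions treated as noise, the 30-minute inactivity timeout and the sample log lines are all assumptions made for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumed Common Log Format; the study's actual log layout is not stated.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
)
STATIC_EXT = (".css", ".js", ".png", ".jpg", ".gif", ".ico")  # assumed noise
TIMEOUT = timedelta(minutes=30)  # assumed session timeout


def clean(lines):
    """Yield (ip, timestamp, url) for successful GET page requests only."""
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip malformed lines
        if m["method"] != "GET" or not m["status"].startswith("2"):
            continue  # skip non-GET and failed requests
        url = m["url"].split("?")[0]
        if url.lower().endswith(STATIC_EXT):
            continue  # skip embedded resources such as images and styles
        ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
        yield m["ip"], ts, url


def sessionize(records):
    """Group records per IP; start a new session after TIMEOUT of inactivity."""
    sessions = []
    last_seen = {}  # ip -> (last timestamp, page list of the open session)
    for ip, ts, url in sorted(records, key=lambda r: (r[0], r[1])):
        last = last_seen.get(ip)
        if last is None or ts - last[0] > TIMEOUT:
            pages = []
            sessions.append(pages)
        else:
            pages = last[1]
        pages.append(url)
        last_seen[ip] = (ts, pages)
    return sessions


# Hypothetical log lines, invented for illustration only.
sample = [
    '10.0.0.1 - - [01/May/2015:10:00:00 +0300] "GET /internet HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/May/2015:10:00:01 +0300] "GET /style.css HTTP/1.1" 200 90',
    '10.0.0.1 - - [01/May/2015:10:05:00 +0300] "GET /business HTTP/1.1" 200 700',
    '10.0.0.1 - - [01/May/2015:11:30:00 +0300] "GET /forum HTTP/1.1" 200 300',
    '10.0.0.2 - - [01/May/2015:10:01:00 +0300] "GET /missing HTTP/1.1" 404 0',
]
sessions = sessionize(clean(sample))
```

Each resulting session is a list of pages, which is the transaction form that a frequent pattern miner expects. Identifying users by IP address alone is a simplification; logs that record a user agent or a cookie would allow a finer split.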
After preprocessing was completed, experiments were conducted on the datasets using the Weka software and the FP-Growth algorithm to discover interesting patterns. Google Analytics and MS Excel were also employed to produce useful statistical reports, including visitors' locations, top channels, top landing pages, top exit pages and the most frequently accessed pages. The statistical analysis shows that new visitors access the website at a higher rate than returning visitors, that most visitors are located locally, and that the home page ranks among the top landing pages as well as the top exit pages.
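To make the notion of a frequent pattern concrete, the sketch below counts how often sets of pages occur together in a handful of transactions. It is a brute-force enumeration, not FP-Growth and not Weka's implementation: FP-Growth reaches the same frequent itemsets far more efficiently by compressing the transactions into a prefix tree. The page names, the transactions and the 0.5 support threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset: support} for itemsets with support >= min_support.

    Brute-force enumeration for illustration; only practical on small data.
    """
    n = len(transactions)
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))  # ignore repeated visits within a transaction
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


# Hypothetical transactions, invented for illustration only.
transactions = [
    ["internet", "business", "troubleshooting"],
    ["internet", "business"],
    ["forum", "general-discussion"],
    ["internet", "business", "troubleshooting"],
]
result = frequent_itemsets(transactions, min_support=0.5)
```

Here `result` maps each frequent set of pages to the fraction of transactions that contain it. Itemsets that fall below the threshold, such as the single forum visit, are dropped.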
The experimental findings indicate a strong relationship between the internet page and the business page: all visitors who visited the internet page also accessed the business page, and vice versa. In addition, visitors who had visited the internet page and the business page also visited the troubleshooting page, which implies that visitors look for a setup procedure on the troubleshooting page after getting information about the available internet and business products from the respective pages. Furthermore, the experimental findings show that the discussion pages, i.e. the forum, general discussion and call conference pages, are most frequently accessed together, implying that these pages need to be directly linked when the website is restructured.
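A relationship in which every visitor of one page also visits another corresponds to an association rule with confidence 1.0 in both directions. The sketch below computes rule confidence on toy data. The sessions and page names are hypothetical and are not taken from the study's results.

```python
def confidence(transactions, antecedent, consequent):
    """conf(A -> B): share of transactions containing A that also contain B."""
    a = set(antecedent)
    ab = a | set(consequent)
    n_a = sum(1 for t in transactions if a <= set(t))
    n_ab = sum(1 for t in transactions if ab <= set(t))
    return n_ab / n_a if n_a else 0.0


# Hypothetical sessions, invented for illustration only.
toy_sessions = [
    {"internet", "business", "troubleshooting"},
    {"internet", "business"},
    {"forum", "general-discussion", "call-conference"},
    {"internet", "business", "troubleshooting"},
]

internet_to_business = confidence(toy_sessions, {"internet"}, {"business"})
business_to_internet = confidence(toy_sessions, {"business"}, {"internet"})
pair_to_troubleshooting = confidence(
    toy_sessions, {"internet", "business"}, {"troubleshooting"}
)
```

In this toy data the first two rules hold in every session that contains the antecedent, while the third holds in two of the three sessions that contain both pages.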
Moreover, promotion is required to make the website more popular, especially with respect to referrals and global usage of the website. Finally, recommendations are made for further research and for restructuring the website.
Keywords: Web Usage Mining, Pattern Discovery, FP-Growth, Google Analytics, Web log analysis.