Application of data mining technology for rainfall forecasting using ensemble methods

Rainfall is one of the natural phenomena that must be understood in order to use it as a resource or to avoid its undesirable effects. Rainfall predictive models are used to find potentially valuable patterns in the data or to predict the outcome of an event. Choosing a predictive technique is difficult because no single technique outperforms all others over a large set of problems. It is also difficult to find the best parameter values for a specific technique, since these settings are problem dependent as well. Ensembles are generally more robust and powerful than individual models, so this thesis uses ensemble diversity to address a wide range of rainfall prediction problems. The main objective of the study is to appraise the applicability of data mining technology and machine learning to rainfall prediction using ensemble and single algorithms. A further contribution is an improvement of a rule extraction technique, which increases comprehensibility and yields more accurate results through ensemble machine learning. The researcher followed a hybrid data mining methodology and used 9,543 instances, with either 8 selected attributes or all 10 attributes, in WEKA 3.8. The J48, PART, MLP and IBk algorithms were used together with ensemble methods. In addition, VB.NET 2010 was used to develop an interface that makes use of the discovered knowledge. Under 10-fold cross-validation, the PART algorithm and the J48 decision tree gave better results than MLP and IBk. For one-day-ahead prediction with the selected attributes, the ensemble PART algorithm and the ensemble J48 decision tree achieved 95.46% and 95.44% prediction accuracy respectively in the WEKA Experimenter. Among the ensemble methods, boosting gave better results than bagging and stacking. For one-month-ahead prediction using all attributes, the ensemble PART algorithm achieved 95.35% and 97.12% accuracy in the WEKA Experimenter and Explorer respectively.
The researcher found temperature, humidity, wind speed, sunshine, month and year to be the major variables for predicting rainfall. Beyond this, the model could be extended to longer forecast horizons, such as ten days ahead and one year ahead. The researcher recommends that other researchers predict other atmospheric variables, such as wind speed, humidity and temperature, and that they apply association rule discovery to find strong internal relationships among meteorological variables.
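
The evaluation approach described above (an ensemble of classifiers scored by 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the thesis's WEKA workflow. It assumes a bagging ensemble of simple one-feature threshold rules (decision stumps) standing in for J48 and PART, and a synthetic two-attribute data set in which rain depends only on humidity. All function names, the toy data, and the 65% humidity cut-off are hypothetical and chosen purely for illustration.

```python
import random
from collections import Counter


def k_fold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def train_stump(rows, labels):
    """Fit a one-feature threshold rule by exhaustive search (binary labels 0/1)."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            for above in (0, 1):
                # Rule: predict `above` when feature f exceeds t, else the other class.
                errors = 0
                for r, y in zip(rows, labels):
                    pred = above if r[f] > t else 1 - above
                    errors += pred != y
                if best is None or errors < best[0]:
                    best = (errors, f, t, above)
    _, f, t, above = best
    return lambda r: above if r[f] > t else 1 - above


def train_bagging(rows, labels, n_models, rng):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    n = len(rows)
    models = []
    for _ in range(n_models):
        sample = [rng.randrange(n) for _ in range(n)]
        models.append(train_stump([rows[i] for i in sample],
                                  [labels[i] for i in sample]))
    return models


def predict_vote(models, row):
    """Combine the base models by majority vote."""
    votes = Counter(m(row) for m in models)
    return votes.most_common(1)[0][0]


def cross_validate(rows, labels, k=10, n_models=7, seed=0):
    """Return k-fold cross-validated accuracy of the bagged ensemble."""
    rng = random.Random(seed)
    correct = 0
    for held_out in k_fold_indices(len(rows), k, rng):
        test_set = set(held_out)
        train_idx = [i for i in range(len(rows)) if i not in test_set]
        models = train_bagging([rows[i] for i in train_idx],
                               [labels[i] for i in train_idx], n_models, rng)
        correct += sum(predict_vote(models, rows[i]) == labels[i]
                       for i in held_out)
    return correct / len(rows)


def make_toy_data(n=120, seed=1):
    """Synthetic (humidity, temperature) rows; label 1 = rain. Illustrative only."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        humidity = rng.uniform(20, 100)
        temperature = rng.uniform(5, 35)
        rows.append((humidity, temperature))
        labels.append(1 if humidity > 65 else 0)  # assumed toy rule, not real data
    return rows, labels


if __name__ == "__main__":
    rows, labels = make_toy_data()
    print(f"10-fold CV accuracy: {cross_validate(rows, labels):.3f}")
```

Every instance is held out exactly once, so the reported figure is the fraction of all instances classified correctly when unseen. Boosting and stacking, which the thesis also compares, would replace `train_bagging` and `predict_vote` with their own training and combination steps while the cross-validation loop stays the same.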


A Thesis Submitted to the School of Information Science Presented in Partial Fulfillment of the Requirements for the Degree of Master of Science in Information Science


Data, Data mining technology, Rainfall forecasting