Research Article

Ensemble of Decision Tree Classifiers for Mining Web Data Streams

by Fauzia Yasmeen Tani, Dewan Md. Farid, Mohammad Zahidur
Communications on Applied Electronics
Foundation of Computer Science (FCS), NY, USA
Volume 1 - Number 1
Year of Publication: 2014
Authors: Fauzia Yasmeen Tani, Dewan Md. Farid, Mohammad Zahidur
DOI: 10.5120/65-0112

Fauzia Yasmeen Tani, Dewan Md. Farid, Mohammad Zahidur. Ensemble of Decision Tree Classifiers for Mining Web Data Streams. Communications on Applied Electronics. 1, 1 (December 2014), 26-32. DOI=10.5120/65-0112

@article{ 10.5120/65-0112,
author = { Fauzia Yasmeen Tani and Dewan Md. Farid and Mohammad Zahidur },
title = { Ensemble of Decision Tree Classifiers for Mining Web Data Streams },
journal = { Communications on Applied Electronics },
issue_date = { December 2014 },
volume = { 1 },
number = { 1 },
month = { December },
year = { 2014 },
issn = { 2394-4714 },
pages = { 26-32 },
numpages = {9},
url = { https://www.caeaccess.org/archives/volume1/number1/65-0112/ },
doi = { 10.5120/65-0112 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2023-09-04T18:37:55.713248+05:30
%A Fauzia Yasmeen Tani
%A Dewan Md. Farid
%A Mohammad Zahidur
%T Ensemble of Decision Tree Classifiers for Mining Web Data Streams
%J Communications on Applied Electronics
%@ 2394-4714
%V 1
%N 1
%P 26-32
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The World Wide Web (WWW or W3, commonly known as the web) is the largest database available, growing by millions of pages a day, which makes mining web data streams a challenging task. Extracting knowledge from web data streams is becoming increasingly complex because the structure of the data does not match the attribute-value representation when large volumes of web data are considered. In this paper, we present an ensemble of decision tree classifiers, an efficient mining method for obtaining a proper set of rules for extracting knowledge from large volumes of web data streams. We built a web server using the Model 2 architecture to collect the web data streams and applied the ensemble classifier to generate decision rules using several decision tree learning models. Experimental results demonstrate that the proposed method performs well in decision making and in predicting the class value of new web data streams.
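
The general idea of combining several decision tree learners and turning them into decision rules can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the tree learners or the combination scheme, so the sketch assumes three one-level trees chosen by ID3-style information gain, C4.5-style gain ratio, and CART-style Gini gain, combined by majority vote. The request attributes (`method`, `section`, `logged_in`) and the class labels (`browse`, `buy`) are hypothetical and serve only as toy data.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gini(values):
    """Gini impurity of a list of discrete values."""
    total = len(values)
    return 1.0 - sum((n / total) ** 2 for n in Counter(values).values())


def partition(rows, labels, attr):
    """Group class labels by the value each row takes for `attr`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    return groups


def impurity_drop(rows, labels, attr, impurity):
    """Impurity of the labels minus the weighted impurity after splitting on `attr`."""
    groups = partition(rows, labels, attr).values()
    remainder = sum(len(g) / len(labels) * impurity(g) for g in groups)
    return impurity(labels) - remainder


def info_gain(rows, labels, attr):
    """ID3-style split criterion."""
    return impurity_drop(rows, labels, attr, entropy)


def gain_ratio(rows, labels, attr):
    """C4.5-style split criterion: information gain over split information."""
    split_info = entropy([row[attr] for row in rows])
    return info_gain(rows, labels, attr) / split_info if split_info > 0 else 0.0


def gini_gain(rows, labels, attr):
    """CART-style split criterion."""
    return impurity_drop(rows, labels, attr, gini)


def train_stump(rows, labels, criterion):
    """Fit a one-level tree: pick the best attribute, map each value to its majority class."""
    best = max(rows[0], key=lambda a: criterion(rows, labels, a))
    rules = {value: Counter(group).most_common(1)[0][0]
             for value, group in partition(rows, labels, best).items()}
    default = Counter(labels).most_common(1)[0][0]  # used for unseen attribute values
    return best, rules, default


def predict_ensemble(stumps, row):
    """Majority vote over the member trees."""
    votes = Counter(rules.get(row[attr], default) for attr, rules, default in stumps)
    return votes.most_common(1)[0][0]


# Hypothetical web requests (toy data, not from the paper).
rows = [
    {"method": "GET", "section": "catalog", "logged_in": "no"},
    {"method": "GET", "section": "catalog", "logged_in": "yes"},
    {"method": "GET", "section": "catalog", "logged_in": "no"},
    {"method": "POST", "section": "cart", "logged_in": "yes"},
    {"method": "GET", "section": "cart", "logged_in": "yes"},
    {"method": "POST", "section": "cart", "logged_in": "no"},
    {"method": "GET", "section": "catalog", "logged_in": "yes"},
]
labels = ["browse", "browse", "browse", "buy", "buy", "buy", "browse"]

stumps = [train_stump(rows, labels, c) for c in (info_gain, gain_ratio, gini_gain)]

# Each member tree reads directly as a set of IF-THEN decision rules.
for attr, rules, _ in stumps:
    for value, label in rules.items():
        print(f"IF {attr} = {value} THEN class = {label}")

new_request = {"method": "POST", "section": "cart", "logged_in": "no"}
print(predict_ensemble(stumps, new_request))
```

On this toy data all three criteria select the same attribute, so the vote is unanimous. On real web data the member trees would typically disagree on some records, and the vote is what resolves those disagreements.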

Index Terms

Computer Science
Information Sciences

Keywords

Data Streams, Decision Tree, J2EE, Model 2 Architecture, Web Server, Web Mining