Abstract
This paper introduces orthogonal decision trees, which offer an effective way to construct a redundancy-free, accurate, and meaningful representation of the large decision tree ensembles often created by popular techniques such as Bagging, Boosting, Random Forests, and many distributed and data stream mining algorithms. Orthogonal decision trees are functionally orthogonal to each other, and they correspond to the principal components of the underlying function space. This paper offers a technique to construct such trees based on the Fourier transformation of decision trees and eigen-analysis of the ensemble in the Fourier representation. It reports experimental results that document the performance of orthogonal trees in terms of accuracy and model complexity.
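The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's procedure: it assumes binary features, uses the Walsh (parity) basis as the Fourier basis, and stops at the eigenvectors without converting them back into trees. The helper `fourier_spectrum` and the toy trees `tree_a`, `tree_b`, and `tree_c` are hypothetical names introduced only for this illustration.

```python
import itertools
import numpy as np

def fourier_spectrum(f, d):
    """Walsh-Fourier coefficients of f over {0,1}^d (hypothetical helper)."""
    points = np.array(list(itertools.product([0, 1], repeat=d)))  # shape (2^d, d)
    values = np.array([f(x) for x in points], dtype=float)
    # Basis function psi_j(x) = (-1)^(j . x); rows index j, columns index x.
    basis = (-1.0) ** (points @ points.T)
    return basis @ values / len(points)

# Toy "trees" over three binary features, written as plain functions.
def tree_a(x):
    return 1 if x[0] == 1 else -1

def tree_b(x):
    if x[0] == 1:
        return 1 if x[1] == 1 else -1
    return -1

def tree_c(x):
    return tree_a(x)  # deliberately redundant copy of tree_a

d = 3
# One row per tree, one column per Fourier coefficient.
W = np.array([fourier_spectrum(t, d) for t in (tree_a, tree_b, tree_c)])

# Eigen-analysis of the ensemble in the Fourier representation.
eigvals, eigvecs = np.linalg.eigh(W.T @ W)
keep = eigvals > 1e-10            # discard directions with no energy
components = eigvecs[:, keep].T   # each row is the spectrum of one orthogonal function

print(components.shape[0], "orthogonal components from", W.shape[0], "trees")
```

Because the parity basis is orthogonal, spectra that are orthogonal as vectors correspond to functions that are orthogonal over the input space. In this toy ensemble the duplicated tree adds no new direction, so fewer components than trees survive the eigenvalue threshold.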
Field | Value |
---|---|
Original language | English |
Article number | 1644727 |
Pages (from-to) | 1028-1042 |
Number of pages | 15 |
Journal | IEEE Transactions on Knowledge and Data Engineering |
Volume | 18 |
Issue number | 8 |
DOIs | |
State | Published - Aug 2006 |
Funding
The authors acknowledge support from US National Science Foundation CAREER award IIS-0093353, US National Science Foundation grant IIS-0203958, and NASA grant NAS2-37143. The work of B.-H. Park was partially funded by the Scientific Data Management Center (http://sdmcenter.lbl.gov) under the Department of Energy’s Scientific Discovery through Advanced Computing (DOE SciDAC) program (http://www.scidac.org). H. Kargupta is also affiliated with Agnik, LLC, Columbia, MD. A four-page version of this paper was published in the Proceedings of the 2004 IEEE International Conference on Data Mining.
Funders | Funder number |
---|---|
Scientific Data Management Center | |
US National Science Foundation | IIS-0203958 |
National Science Foundation | IIS-0093353 |
U.S. Department of Energy | |
National Aeronautics and Space Administration | NAS2-37143 |
Keywords
- Fourier transform
- Orthogonal decision trees
- Principal component analysis
- Redundancy free trees