The objective of this thesis is to develop classification models using evolutionary algorithms, focusing on scalability, interpretability and accuracy on complex, high-dimensional data sets.
This Ph.D. thesis presents new computational models for data classification that address open problems and challenges in the field by means of evolutionary algorithms. Specifically, we aim to improve the performance, scalability, interpretability and accuracy of classification models on challenging data. The performance and scalability of evolutionary classification models were improved through parallel computation on GPUs, which proved highly efficient at speeding up classification algorithms.
The conflicting objectives of interpretability and accuracy in classification models were addressed through a highly interpretable classification algorithm that produces very comprehensible classifiers in the form of classification rules. Performance on challenging data, such as imbalanced classification problems, was improved by means of a data gravitation classification algorithm, which achieved better classification performance on both balanced and imbalanced data.
All the methods proposed in this thesis were evaluated within a sound experimental framework, using a large number of data sets of diverse dimensionality and comparing their performance against other state-of-the-art and recently published methods of proven quality. The experimental results were verified by non-parametric statistical tests, which support the better performance of the proposed methods.
The development of this thesis was supported by:
- Spanish Ministry of Science and Technology, project TIN2011-22408.
- Regional Government of Andalusia, project P08-TIC-3720.
- Spanish Ministry of Education, FPU program, grant AP-2010-0042.
PUBLICATIONS ASSOCIATED WITH THIS THESIS
- A. Cano, A. Zafra, and S. Ventura. Speeding up the evaluation phase of GP classification algorithms on GPUs. Soft Computing, vol. 16(2), pp. 187-202, 2012.
- A. Cano, A. Zafra, and S. Ventura. An Interpretable Classification Rule Mining Algorithm. Information Sciences, vol. 240, pp. 1-20, 2013.
- A. Cano, A. Zafra, and S. Ventura. Parallel evaluation of Pittsburgh rule-based classifiers on GPUs. Neurocomputing, vol. 126, pp. 45-57, 2014.
- A. Cano, A. Zafra, and S. Ventura. Weighted Data Gravitation Classification for Standard and Imbalanced Data. IEEE Transactions on Cybernetics, vol. 43(6), pp. 1672-1687, 2014.
- A. Cano, J.M. Luna, A. Zafra, and S. Ventura. A Classification Module for Genetic Programming Algorithms in JCLEC. Journal of Machine Learning Research, vol. 16, pp. 491-494, 2015.
- A. Cano, A. Zafra, and S. Ventura. Speeding up multiple instance learning classification rules on GPUs. Knowledge and Information Systems, vol. 44(1), pp. 127-145, 2015.
- A. Cano, A. Zafra, and S. Ventura. Solving classification problems using genetic programming algorithms on GPUs. Proceedings of the 5th International Conference on Hybrid Artificial Intelligent Systems (HAIS’10), Lecture Notes in Computer Science, vol. 6077 LNAI(PART 2), pp. 17-26, 2010.
- A. Cano, J.M. Luna, J.L. Olmo, and S. Ventura. JCLEC meets WEKA! Proceedings of the 6th International Conference on Hybrid Artificial Intelligent Systems (HAIS’11), Lecture Notes in Computer Science, vol. 6678 LNAI(PART 1), pp. 388-395, 2011.
- A. Cano, A. Zafra, and S. Ventura. A parallel genetic programming algorithm for classification. Proceedings of the 6th International Conference on Hybrid Artificial Intelligent Systems (HAIS’11), Lecture Notes in Computer Science, vol. 6678 LNAI(PART 1), pp. 172-181, 2011.
- A. Cano, A. Zafra, and S. Ventura. An EP algorithm for learning highly interpretable classifiers. Proceedings of the 11th International Conference on Intelligent Systems Design and Applications (ISDA’11), pp. 325-330, 2011.
- A. Cano, A. Zafra, E.L. Gibaja, and S. Ventura. A Grammar-Guided Genetic Programming Algorithm for Multi-Label Classification. Proceedings of the 16th European Conference on Genetic Programming (EuroGP’13), Lecture Notes in Computer Science, vol. 7831, pp. 217-228, 2013.