This website provides supplementary material for the paper "Data mining approach improves classification accuracy of HCV infection outcome".
Abstract
Background: A dataset of genes used to predict HCV infection outcome was evaluated in a previous study using conventional statistical methods.
Objective: The aim of this study was to reanalyze the same dataset using a data mining approach in order to find models that improve the classification accuracy obtained with the genes studied.
Methods: We built predictive models using different subsets of factors, selected according to their importance in predicting the patient classification. We then evaluated not only each individual model but also a combination of them, which led to a better predictive model.
Results: Our data mining approach identified genetic patterns that escaped detection by conventional statistics. Specifically, the PART and ENSEMBLE models increased the classification accuracy of HCV outcome compared to conventional methods.
Conclusions: Data mining can be used more extensively in biomedicine, facilitating knowledge building and management of human diseases.
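The combination step described under Methods can be illustrated with a minimal sketch. It assumes the combination is a simple majority vote, which is only one possible way to build an ensemble and is not necessarily the scheme used in the paper. The feature names (f1, f2, f3), their values, the importance scores, and the toy rules are all hypothetical placeholders, not findings from the dataset.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

Patient = Dict[str, str]
Model = Callable[[Patient], str]


def top_k_features(importance: Dict[str, float], k: int) -> List[str]:
    """Return the k feature names with the highest importance score."""
    ranked = sorted(importance, key=lambda name: importance[name], reverse=True)
    return ranked[:k]


def majority_vote(models: Sequence[Model], patient: Patient) -> str:
    """Return the class predicted by most models (ties go to the first class seen)."""
    votes = Counter(model(patient) for model in models)
    return votes.most_common(1)[0][0]


def accuracy(model: Model, patients: Sequence[Patient], labels: Sequence[str]) -> float:
    """Fraction of patients whose predicted class matches the true label."""
    correct = sum(1 for p, y in zip(patients, labels) if model(p) == y)
    return correct / len(labels)


# Hypothetical importance scores for three placeholder features.
importance = {"f1": 0.9, "f2": 0.5, "f3": 0.1}
selected = top_k_features(importance, 2)  # ["f1", "f2"]

# Toy single-feature rules standing in for trained models (assumption).
base_models: List[Model] = [
    lambda p: "SR" if p["f1"] == "a" else "CHC",
    lambda p: "SR" if p["f2"] == "a" else "CHC",
    lambda p: "SR" if p["f3"] == "a" else "CHC",
]


def ensemble(patient: Patient) -> str:
    return majority_vote(base_models, patient)


# Four made-up patients with made-up labels.
patients = [
    {"f1": "a", "f2": "a", "f3": "b"},
    {"f1": "b", "f2": "a", "f3": "b"},
    {"f1": "a", "f2": "b", "f3": "a"},
    {"f1": "a", "f2": "b", "f3": "b"},
]
labels = ["SR", "CHC", "SR", "CHC"]

individual = [accuracy(m, patients, labels) for m in base_models]
combined = accuracy(ensemble, patients, labels)
print(individual)  # [0.75, 0.5, 0.75]
print(combined)    # 1.0
```

In this toy case each rule misclassifies at least one patient, but their errors fall on different patients, so the vote corrects them. That is the general reason a combination can outperform its individual members.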
Data
The dataset used in this study was obtained from a previous paper published in 2018 [Fri18], based on a study carried out between 2013 and 2017 on a total of 138 individuals, all of whom were HIV/HCV co-infected patients from the Infectious Diseases Unit at the Hospital Reina Sofía in Cordoba (Spain). The patients were categorized as chronic hepatitis C (CHC) or spontaneous resolution (SR). The dataset comprises 43 input features derived from different markers in every patient. The markers were IFNL3 genotype (1 feature), HLA-B (17 features), epitope Bw (1 feature), HLA-C (12 features), and KIR genotype (12 features). A total of 46 of the 138 patients included in this study had a missing value in at least one feature. The dataset can be downloaded from the link below.
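As a quick consistency check on the description above, the sketch below sums the per-marker feature counts and shows one way to count patients with at least one missing value. The column names, the toy rows, and the tokens treated as missing ("", "NA", None) are assumptions; the actual file may encode its columns and missing values differently.

```python
from typing import Dict, Iterable, Optional

# Feature counts per marker group, as stated in the data description.
FEATURES_PER_MARKER = {
    "IFNL3 genotype": 1,
    "HLA-B": 17,
    "epitope Bw": 1,
    "HLA-C": 12,
    "KIR genotype": 12,
}
total_features = sum(FEATURES_PER_MARKER.values())
print(total_features)  # 43

# Assumed encodings of a missing value; adjust to match the real file.
MISSING_TOKENS = ("", "NA", None)


def count_incomplete(rows: Iterable[Dict[str, Optional[str]]]) -> int:
    """Count rows (patients) that have a missing value in at least one feature."""
    return sum(
        1 for row in rows
        if any(value in MISSING_TOKENS for value in row.values())
    )


# Made-up rows with hypothetical column names, for illustration only.
toy_rows = [
    {"col_1": "x", "col_2": "y"},
    {"col_1": "NA", "col_2": "y"},
    {"col_1": "x", "col_2": ""},
]
n_incomplete = count_incomplete(toy_rows)
print(n_incomplete)  # 2
```

Applied to the full dataset, a check of this kind would be expected to report 46 incomplete patients out of 138, provided the missing-value encoding is matched correctly.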
References
[Fri18] M. Frias, A. Rivero-Juárez, D. Rodriguez-Cano, A. Camacho, P. López-López, M. A. Risalde, B. Manzanares-Martín, T. Brieva, I. Machuca, and A. Rivero. (2018) HLA-B, HLA-C and KIR improve the predictive value of IFNL3 for Hepatitis C spontaneous clearance. Scientific Reports, 8(1), 1-7.