Machine learning methods for binary and multiclass classification of melanoma thickness from dermoscopic images

This page provides the datasets, and links to the implementations of the machine learning methods, used in the paper "Machine learning methods for binary and multiclass classification of melanoma thickness from dermoscopic images", published in IEEE Transactions on Medical Imaging. If you use these datasets, please cite the associated publication.

1. Citation details

Aurora Sáez, Javier Sánchez-Monedero, Pedro Antonio Gutiérrez and César Hervás-Martínez, Machine learning methods for binary and multiclass classification of melanoma thickness from dermoscopic images, IEEE Transactions on Medical Imaging, Volume 35, Issue 4, pp. 1036-1045, 2016. DOI: 10.1109/TMI.2015.2506270

2. Abstract of the paper

Abstract—Thickness of the melanoma is the most important factor associated with survival in patients with melanoma. It is most commonly reported as a measurement of depth given in millimeters (mm), and computed by means of pathological examination after a biopsy of the suspected lesion. In order to avoid the use of an invasive method in the estimation of the thickness of melanoma before surgery, we propose a computational image analysis system from dermoscopic images. The proposed feature extraction is based on the clinical findings that correlate certain characteristics present in dermoscopic images and tumor depth. Two supervised classification schemes are proposed: a binary classification in which melanomas are classified into thin or thick, and a three-classes scheme (thin, intermediate, and thick). The performance of several nominal classification methods, among them a recent interpretable method combining logistic regression with artificial neural networks (Logistic regression using Initial variables and Product Units, LIPU), is compared. For the three classes problem, a set of ordinal classification methods (considering ordering relation between the three classes) is included. For the binary case, LIPU outperforms the other methods with an accuracy of 77.6%, and for the second scheme, the ordinal classification methods achieve a better balance between the accuracies obtained for all classes.

3. Datasets

This section includes the datasets corresponding to binary and ordinal versions of the problem. We include the whole dataset and the 10-fold partitions:

Melanoma Dataset including 10-fold partitions (ZIP compressed file)

Each file contains one folder per dataset, holding the 10-fold training and generalization (test) partitions. Each partition is provided in three file formats:

matlab: files used by the ORCA framework.
weka: Weka file format.
nnep: JCLEC-NNEP file format (a description of the format is available in the Partitions and Source Code section of AYRNA's website).

4. Links to nominal and ordinal classification implementation

We use ORCA (Ordinal Regression and Classification Algorithms), a MATLAB framework, for the following methods:

Kernel Discriminant Analysis (KDA)
Support Vector Machine for Classification (SVC)
Support Vector Ordinal Regression with implicit constraints (SVORIM)
RED-SVM, which applies the reduction from cost-sensitive ordinal ranking to weighted binary classification (RED) framework to SVM
Kernel Discriminant Learning for Ordinal Regression (KDLOR)

For Logistic regression using Initial variables and Product Units (LIPU) and Product Units Neural Network (PUNN), we used the source code available at http://www.uco.es/grupos/ayrna/en/partitions-and-datasets/#paguitierrez2011ieeetnn.

For Logistic Regression (LR), we used the SimpleLogistic implementation available in Weka.

5. Confusion matrices

As supplementary experimental information, we provide here the confusion matrices obtained on the generalization (test) sets for all the methods in the paper:

Binary classification problem
LIPU
144	23
33	50
LR
144	23
39	44
PUNN
139	28
42	41
KDA
123	44
28	55
SVC
148	19
40	43
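
Overall accuracy and per-class rates can be derived directly from the matrices above. The following Python sketch assumes that rows are true classes and columns are predicted classes, in the order (thin, thick). The page does not state this convention, although the row sums (167 and 83) are identical for every method, which is consistent with it. The function names are illustrative and are not part of ORCA, Weka, or JCLEC-NNEP.

```python
# Binary confusion matrices copied from the listing above.
# ASSUMPTION: rows = true class, columns = predicted class, order (thin, thick).
BINARY = {
    "LIPU": [[144, 23], [33, 50]],
    "LR":   [[144, 23], [39, 44]],
    "PUNN": [[139, 28], [42, 41]],
    "KDA":  [[123, 44], [28, 55]],
    "SVC":  [[148, 19], [40, 43]],
}


def accuracy(cm):
    """Fraction of patterns that lie on the main diagonal."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def per_class_recall(cm):
    """Diagonal entry divided by its row sum, one value per class."""
    return [cm[i][i] / sum(cm[i]) for i in range(len(cm))]


if __name__ == "__main__":
    for name, cm in BINARY.items():
        recalls = ", ".join(f"{r:.3f}" for r in per_class_recall(cm))
        print(f"{name:5s} accuracy={accuracy(cm):.3f} recall=[{recalls}]")
```

Under that assumption, LIPU gives (144 + 50) / 250 = 0.776, which agrees with the 77.6% accuracy quoted in the abstract.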

Ordinal classification problem
LIPU
154	12	1
40	10	4
9	13	7
LR
149	16	2
42	7	5
15	12	2
PUNN
148	13	6
43	8	3
14	9	6
KDLOR
115	47	5
19	30	5
3	10	16
SVC
144	16	7
33	14	7
11	10	8
REDSVM
125	38	4
28	21	5
6	13	10
SVORIM
124	41	2
25	25	4
8	11	10
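
The balance between classes mentioned in the abstract can be inspected from the three-class matrices above. The sketch below computes per-class recall, the minimum of those recalls, and the mean absolute error (MAE) between true and predicted class ranks. It assumes that rows are true classes and columns are predicted classes, in the order (thin, intermediate, thick); the page does not state this convention, but the row sums (167, 54, 29) are identical for every method. MAE is included as a common ordinal metric for illustration, not as a figure reported on this page, and the function names are hypothetical.

```python
# Three-class confusion matrices copied from the listing above.
# ASSUMPTION: rows = true class, columns = predicted class,
# order (thin, intermediate, thick).
ORDINAL = {
    "LIPU":   [[154, 12, 1], [40, 10, 4], [9, 13, 7]],
    "LR":     [[149, 16, 2], [42, 7, 5], [15, 12, 2]],
    "PUNN":   [[148, 13, 6], [43, 8, 3], [14, 9, 6]],
    "KDLOR":  [[115, 47, 5], [19, 30, 5], [3, 10, 16]],
    "SVC":    [[144, 16, 7], [33, 14, 7], [11, 10, 8]],
    "REDSVM": [[125, 38, 4], [28, 21, 5], [6, 13, 10]],
    "SVORIM": [[124, 41, 2], [25, 25, 4], [8, 11, 10]],
}


def per_class_recall(cm):
    """Diagonal entry divided by its row sum, one value per class."""
    return [cm[i][i] / sum(cm[i]) for i in range(len(cm))]


def mean_absolute_error(cm):
    """Average distance |true rank - predicted rank| over all patterns."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    cost = sum(abs(i - j) * cm[i][j] for i in range(n) for j in range(n))
    return cost / total


if __name__ == "__main__":
    for name, cm in ORDINAL.items():
        recalls = per_class_recall(cm)
        shown = ", ".join(f"{r:.3f}" for r in recalls)
        print(f"{name:7s} recall=[{shown}] "
              f"min={min(recalls):.3f} MAE={mean_absolute_error(cm):.3f}")
```

Read this way, LIPU recovers 154 of 167 patterns in the first class but only 10 of 54 in the second, whereas KDLOR recovers 115 of 167, 30 of 54, and 16 of 29, a more even spread across the three classes.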