
Partial order label decomposition approaches for melanoma diagnosis

This page provides the datasets and links to the implementations of the machine learning methods used in the paper "Partial order label decomposition approaches for melanoma diagnosis", accepted in Applied Soft Computing. If you use these datasets or methods, please cite the associated publication.

1. Citation details

J. Sánchez-Monedero, M. Pérez-Ortiz, A. Sáez, P. A. Gutiérrez, and C. Hervás-Martínez, Partial order label decomposition approaches for melanoma diagnosis, Applied Soft Computing, Volume 64, March 2018, Pages 341-355. https://doi.org/10.1016/j.asoc.2017.11.042

2. Abstract of the paper

Melanoma is a type of cancer that develops from the pigment-containing cells known as melanocytes. Usually occurring on the skin, early detection and diagnosis is strongly related to survival rates. Melanoma recognition is a challenging task that nowadays is performed by well trained dermatologists who may produce varying diagnosis due to the task complexity. This motivates the development of automated diagnosis tools, in spite of the inherent difficulties (intra-class variation, visual similarity between melanoma and non-melanoma lesions, among others). In the present work, we propose a system combining image analysis and machine learning to detect melanoma presence and severity. The severity is assessed in terms of melanoma thickness, which is measured by the Breslow index. Previous works mainly focus on the binary problem of detecting the presence of the melanoma. However, the system proposed in this paper goes a step further by also considering the stage of the lesion in the classification task. To do so, we extract 100 features that consider the shape, colour, pigment network and texture of the benign and malignant lesions. The problem is tackled as a five-class classification problem, where the first class represents benign lesions, and the remaining four classes represent the different stages of the melanoma (via the Breslow index). Based on the problem definition, we identify the learning setting as a partial order problem, in which the patterns belonging to the different melanoma stages present an order relationship, but where there is no order arrangement with respect to the benign lesions. Under this assumption about the class topology, we design several proposals to exploit this structure and improve data preprocessing. 
In this sense, we experimentally demonstrate that those proposals exploiting the partial order assumption achieve better performance than 12 baseline nominal and ordinal classifiers (including a deep learning model) which do not consider this partial order. To deal with class imbalance, we additionally propose specific over-sampling techniques that consider the structure of the problem for the creation of synthetic patterns. The experimental study is carried out with clinician-curated images from the Interactive Atlas of Dermoscopy, which eases reproducibility of experiments. Concerning the results obtained, in spite of having augmented the complexity of the classification problem with more classes, the performance of our proposals in the binary problem is similar to the one reported in the literature.
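The class structure described in the abstract can be made concrete with a small sketch. It assumes an illustrative integer encoding (0 for benign, 1 to 4 for the melanoma stages by increasing Breslow thickness) that is not taken from the dataset files, and the `decompose` helper is a hypothetical example of splitting a label into a binary part and an ordinal part, not necessarily the decomposition used in the paper.

```python
# Illustrative class encoding (an assumption, not read from the dataset files):
# 0 = benign lesion, 1..4 = melanoma stages ordered by increasing Breslow thickness.
BENIGN = 0
STAGES = (1, 2, 3, 4)


def comparable(a: int, b: int) -> bool:
    """Two classes are comparable if they are equal or both are melanoma stages."""
    return a == b or (a in STAGES and b in STAGES)


def precedes(a: int, b: int) -> bool:
    """True if class a comes strictly before class b in the partial order."""
    return a in STAGES and b in STAGES and a < b


def decompose(label: int):
    """Split a five-class label into (is_melanoma, stage).

    The stage is None for benign lesions, because they have no position
    in the severity order. Hypothetical helper, for illustration only.
    """
    if label == BENIGN:
        return (0, None)
    if label not in STAGES:
        raise ValueError(f"unknown class label: {label}")
    return (1, label)


if __name__ == "__main__":
    print(comparable(BENIGN, 3))   # benign vs. a melanoma stage
    print(precedes(1, 4))          # two melanoma stages
    print([decompose(y) for y in (0, 1, 4)])
```

Under this encoding, any two melanoma stages can be ranked against each other, while the benign class is only comparable with itself, which is what distinguishes the setting from plain nominal or ordinal classification.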

3. Datasets

This section includes the dataset used in the paper. We include the whole dataset and the 10-fold partitions:

Each file contains one folder per dataset with the 10-fold training and generalization (test) partitions. Each partition is provided in two file formats:

  • matlab: files used by the ORCA framework.
  • weka: Weka file format.
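
As a rough guide to reading one partition outside MATLAB, the following Python sketch parses a plain-text file. It assumes that the matlab-format files are whitespace-separated numeric text with one pattern per line and the class label in the last column; this layout should be checked against the downloaded files. The function name `load_partition` and the toy file name are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def load_partition(path):
    """Read one partition file into (features, labels).

    Assumed layout (verify against the actual files): one pattern per line,
    whitespace-separated numeric values, class label in the last column.
    """
    features, labels = [], []
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        *values, label = fields
        features.append([float(v) for v in values])
        labels.append(int(float(label)))
    return features, labels


if __name__ == "__main__":
    # Toy file standing in for a real training partition (hypothetical name).
    with tempfile.TemporaryDirectory() as tmp:
        toy = os.path.join(tmp, "train_toy.0")
        Path(toy).write_text("0.1 0.7 1\n0.4 0.2 3\n")
        X, y = load_partition(toy)
        print(len(X), "patterns,", len(X[0]), "features, labels:", y)
```

The same function can be applied to the training and test files of each of the 10 folds, provided they follow the assumed layout.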

4. Links to nominal and ordinal classification implementation

We use ORCA (Ordinal Regression and Classification Algorithms), a MATLAB/Octave framework that includes all the methods used in the experiments, with the exception of TensorFlow. The dataset is included in the ORCA repository.

The links to the specific implementation of the partial ordering method are: