Elephant Data Set


This problem consists of identifying the intended target object(s) in images. The main difficulty is that an image may contain multiple, possibly heterogeneous objects, so a global description of the whole image is too coarse to achieve good classification and retrieval accuracy. Even if relevant images are provided, identifying which object(s) within the example images are relevant remains a hard problem in the supervised learning setting. However, this problem fits the MIL (multiple-instance learning) setting well: each image can be treated as a bag of segments, which are modeled as instances, and the concept point representing the target object can be learned through MIL algorithms.

This data set targets elephants. It consists of 100 images that contain elephants and another 100 images that contain other animals. The final goal is to distinguish images that contain elephants from those that do not.
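
The bag-of-instances view described above can be sketched in a few lines of Python. This is only an illustration: the `Bag` class, the `bag_label_from_instances` helper, and the feature values are hypothetical and are not taken from the data set files. The labeling rule shown is the standard MIL assumption (a bag is positive if at least one of its instances is positive), which this description does not state explicitly.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bag:
    """One image, represented as a bag of segment feature vectors (hypothetical layout)."""
    bag_id: str
    instances: List[List[float]]  # one feature vector per image segment
    label: int                    # 1 = image contains an elephant, 0 = it does not


def bag_label_from_instances(instance_labels: List[int]) -> int:
    """Standard MIL assumption: a bag is positive if any instance is positive."""
    return int(any(instance_labels))


# Made-up image with three segments; the numbers carry no meaning.
example = Bag(
    bag_id="img_001",
    instances=[[0.1, 0.7], [0.4, 0.2], [0.9, 0.3]],
    label=1,
)
```

Note that only the bag label is observed during training. The instance labels passed to `bag_label_from_instances` are latent, which is what distinguishes MIL from ordinary supervised learning.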

Dataset Partitions

The original data set is partitioned with a 10-fold cross-validation procedure, repeated five times. Thus, five different 10-fold cross-validation partitions are available.


10-fold cross validation
Procedure 1 elephant-10-proc1.arff
Procedure 2 elephant-10-proc2.arff
Procedure 3 elephant-10-proc3.arff
Procedure 4 elephant-10-proc4.arff
Procedure 5 elephant-10-proc5.arff
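
For readers who want to reproduce a comparable setup, the following is a minimal sketch of five repeated 10-fold splits made at the bag (image) level. It is not the procedure used to generate the files listed above: the `repeated_kfold` function, the seed, and the `img_XXX` identifiers are assumptions for illustration, and whether the published partitions are stratified by class is not stated here.

```python
import random
from typing import List


def repeated_kfold(bag_ids: List[str], k: int = 10, repeats: int = 5,
                   seed: int = 0) -> List[List[List[str]]]:
    """Return `repeats` independent partitions, each a list of `k` disjoint folds."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    partitions = []
    for _ in range(repeats):
        shuffled = list(bag_ids)
        rng.shuffle(shuffled)
        # Deal the shuffled bags round-robin into k folds of near-equal size.
        folds = [shuffled[i::k] for i in range(k)]
        partitions.append(folds)
    return partitions


# 200 hypothetical image identifiers, matching the data set size described above.
bag_ids = [f"img_{i:03d}" for i in range(200)]
partitions = repeated_kfold(bag_ids)
```

Splitting by bag rather than by instance keeps all segments of one image in the same fold, so no image contributes to both the training and the test side of a split.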