Musk2 Data Set
Description
The task is to determine whether a drug molecule binds strongly to a target protein. Each molecule can adopt many shapes, or conformations. A molecule is positive if at least one of its conformations binds well (which one is not known) and negative if none of them does. This maps naturally onto the multiple-instance learning (MIL) setting: each molecule is a bag, and the conformations it can adopt are the instances in that bag.
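The labeling rule described above can be sketched in a few lines of Python. This is only an illustration of the standard MIL assumption: the molecule names and per-conformation flags below are hypothetical toy values, not taken from the data set, and in the real data the instance-level labels are not observed (only the bag label is).

```python
from typing import Dict, List


def bag_label(instance_binds: List[bool]) -> bool:
    """Return True if at least one instance (conformation) in the bag binds."""
    return any(instance_binds)


# Hypothetical toy bags: molecule name -> one flag per conformation.
# In the actual data set these per-conformation flags are unknown.
bags: Dict[str, List[bool]] = {
    "molecule_a": [False, True, False],  # one binding conformation -> positive
    "molecule_b": [False, False],        # no binding conformation -> negative
}

labels = {name: bag_label(flags) for name, flags in bags.items()}
```

A learner in this setting sees only the bag labels and the instance features, and has to infer which conformations, if any, are responsible for a positive label.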
Dataset
The original data set was partitioned with a 10-fold cross-validation procedure, repeated five times, so five different 10-fold partitions are available:
10-fold cross validation | Files |
Procedure 1 | musk2-10-proc1.arff |
Procedure 2 | musk2-10-proc2.arff |
Procedure 3 | musk2-10-proc3.arff |
Procedure 4 | musk2-10-proc4.arff |
Procedure 5 | musk2-10-proc5.arff |
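As a rough illustration of what "five different partitions of 10-fold cross validation" means, the sketch below splits a list of bags into 10 folds five times, each time with a different shuffle. It is not the procedure used to build the files above: the source does not say how the folds were drawn (for example, whether they were stratified), and the bag identifiers, bag count, and seeds here are hypothetical. Folds are formed over whole bags so that all conformations of a molecule stay in the same fold.

```python
import random
from typing import Dict, List


def ten_fold_partition(bag_ids: List[str], seed: int, n_folds: int = 10) -> List[List[str]]:
    """Shuffle the bag ids with the given seed and deal them into n_folds folds."""
    shuffled = list(bag_ids)
    random.Random(seed).shuffle(shuffled)
    # Fold i takes every n_folds-th bag starting at position i.
    return [shuffled[i::n_folds] for i in range(n_folds)]


# Hypothetical bag identifiers; the real data set has its own molecule names.
bag_ids = [f"bag_{i}" for i in range(25)]

# One independent 10-fold partition per procedure (seeds are arbitrary).
procedures: Dict[int, List[List[str]]] = {
    proc: ten_fold_partition(bag_ids, seed=proc) for proc in range(1, 6)
}
```

Within one procedure, each fold serves once as the test set while the remaining nine folds form the training set.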