ONLINE LEARNING - MULTI-LABEL CLASSIFIER

Research Aim

The primary objective of this project is to develop a novel, generic, real-time multi-label classifier that performs consistently well on datasets spanning a wide range of label densities, label cardinalities and application domains. Multi-label classification is considerably more complex than binary or multi-class classification, because the classifier must identify both how many target labels apply to each input sample and which labels they are. Because of its generality, multi-label classification is needed in a rapidly growing number of real-world applications. The developed technique serves as the base platform for the development of a universal classifier.

Because multi-label problems are more complex and multi-label datasets vary widely in label density and label cardinality, a classifier that performs well on one dataset might not perform well on another. Real-world applications also require high-speed, real-time classification of multi-label data.
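The two dataset characteristics mentioned here can be made concrete with a short sketch. It assumes the commonly used definitions (label cardinality as the mean number of labels per sample, label density as cardinality divided by the total number of labels) and a hypothetical toy label matrix. The function names are illustrative and are not taken from the project's code.

```python
from typing import Sequence

# Each row is one sample; each column is one label (1 = label applies).
LabelMatrix = Sequence[Sequence[int]]


def label_cardinality(Y: LabelMatrix) -> float:
    """Mean number of active labels per sample."""
    return sum(sum(row) for row in Y) / len(Y)


def label_density(Y: LabelMatrix) -> float:
    """Label cardinality normalised by the total number of labels."""
    return label_cardinality(Y) / len(Y[0])


# Hypothetical toy data: 4 samples, 4 possible labels.
Y = [
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]

print(label_cardinality(Y))  # 7 active labels / 4 samples = 1.75
print(label_density(Y))      # 1.75 / 4 labels = 0.4375
```

Two datasets with the same number of labels can differ sharply in these values, which is one reason a classifier tuned on one dataset may not transfer to another.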

Datasets

Yeast (Biology), Scene (Multimedia), Corel5k (Multimedia), Enron (Text), Medical (Text)

Dataset specifications are given below:

Algorithm

Code

The source code for the progressive learning technique for multi-class classification is available on GitHub (rajasekar-venkatesan/online-multilabel-classifier).

Results

More detailed discussions and results are available in the papers:

A Novel Online Real-time Classifier for Multi-label Data Streams [PDF]

A Novel Online Multi-label Classifier for High-Speed Streaming Data Applications [PDF]

Code is available on GitHub [Code]