Classification is a data mining task that aims to extract knowledge from large datasets. There are two kinds of classification. The first, known as complete classification, is applied to balanced datasets. When classification is applied to imbalanced datasets, it is called partial classification, or the problem of classification in imbalanced datasets. This is a fundamental problem in machine learning that has received much attention. Given the importance of this issue, a large number of techniques have been proposed to address it. These proposals can be divided into three levels: the algorithm level, the data level, and the hybrid level. In this chapter, we present the classification problem in imbalanced datasets, its application domains, its appropriate performance measures, and its approaches and techniques.
Part of the book: Recent Trends in Computational Intelligence
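As a small illustration of why imbalanced datasets call for their own performance measures, the sketch below scores a classifier that always predicts the majority class. The choice of measures (accuracy, recall, and F1), the toy data, and all function names are assumptions made for this example; the abstract does not say which measures the chapter covers.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted class matches the true class."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positive-class instances that were detected."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    rec = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + rec == 0:
        return 0.0
    return 2 * precision * rec / (precision + rec)


# Hypothetical imbalanced dataset: 95 majority (0) and 5 minority (1) instances.
y_true = [0] * 95 + [1] * 5
# A trivial classifier that always predicts the majority class.
y_pred = [0] * 100

majority_accuracy = accuracy(y_true, y_pred)
minority_recall = recall(y_true, y_pred)
minority_f1 = f1_score(y_true, y_pred)

print(f"accuracy={majority_accuracy:.2f} recall={minority_recall:.2f} f1={minority_f1:.2f}")
```

On this toy data, accuracy is high even though no minority instance is detected, which is why measures centered on the minority class are usually preferred for imbalanced problems.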
Multi-label datasets contain several classes, where each class can have multiple values. They appear in several domains, such as the categorization of music into emotions and directed marketing. In this chapter, we are interested in the most popular data mining task, classification, and more precisely in classification in multi-label datasets. To this end, we present the different methods used to extract knowledge from these datasets. These methods are divided into two categories: problem transformation methods and algorithm adaptation methods. Methods of the first category transform a multi-label classification problem into one or more single-label classification problems, while methods of the second category extend a specific learning algorithm so that it handles multi-label datasets directly. We also present the evaluation measures used to assess the quality of the extracted knowledge.
Part of the book: Information Systems Management
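The idea behind problem transformation can be sketched with binary relevance, a well-known method of this category that builds one binary dataset per label. This is only one possible transformation, chosen here for illustration; the abstract does not say which methods the chapter covers. The function name, feature values, and emotion labels below are hypothetical.

```python
def binary_relevance_transform(features, label_sets, labels):
    """Split one multi-label dataset into one binary dataset per label.

    features   : list of feature vectors, one per instance
    label_sets : list of sets, the labels attached to each instance
    labels     : the full list of possible labels
    Returns a dict mapping each label to (features, binary_targets).
    """
    transformed = {}
    for label in labels:
        # Target is 1 when the instance carries this label, 0 otherwise.
        targets = [1 if label in instance_labels else 0
                   for instance_labels in label_sets]
        transformed[label] = (features, targets)
    return transformed


# Hypothetical music dataset: two made-up audio features per song,
# and a set of emotion labels per song.
songs = [[0.8, 0.2], [0.1, 0.9], [0.7, 0.4]]
emotions = [{"happy", "relaxing"}, {"sad"}, {"happy"}]
all_labels = ["happy", "sad", "relaxing"]

binary_problems = binary_relevance_transform(songs, emotions, all_labels)

for label, (_, targets) in binary_problems.items():
    print(label, targets)
```

Each resulting binary dataset can then be handed to any ordinary single-label classifier, and the per-label predictions are combined to form the multi-label output. An algorithm adaptation method would instead modify the learner itself so that it consumes the label sets directly.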