Papers by Ludmila Kuncheva
IEEE Systems, Man, and Cybernetics Magazine, 2020
Jim Bezdek once told me, "Write just the same way you talk!" That is my excuse for the unashamedly colloquial text to follow. Many, many years ago, sometime during the rock 'n' roll 1980s, when ladies wore shoulder pads and IBM 80-column punched cards were in high fashion, everybody in our research team had heard of Jim Bezdek. We were bewitched by fuzzy pattern recognition, and Jim, the author of the famous book Pattern Recognition With Fuzzy Objective Function Algorithms [1], was our hero, alongside Lotfi Zadeh. Years later, in 1993, I had the good fortune to attend one of Jim's plenary talks at a conference in Aachen, Germany. He walked in wearing the most colorful Hawaiian shirt, blue shorts, a baseball cap, and a smile brighter than Florida sunshine. His talk was magic. In 1996-1997, thanks to a generous grant from the National Science Foundation's Collaboration in Basic Science and Engineering (COBASE) program, I spent six months in Pensacola, Florida, working with him. I treasure that time as the most valuable and enlightening experience of my career.
The paper represents my personal views and should not be generalised lightly. More importantly, it should not be taken as the view of my School of Computer Science or that of Bangor University.
ArXiv, 2018
This paper draws a parallel between similarity-based categorisation models developed in cognitive psychology and the nearest neighbour classifier (1-NN) in machine learning. Conceived as a result of the historical rivalry between prototype theories (abstraction) and exemplar theories (memorisation), recent models of human categorisation seek a compromise in-between. Regarding the stimuli (entities to be categorised) as points in a metric space, machine learning offers a large collection of methods to select a small, representative and discriminative point set. These methods are known under various names: instance selection, data editing, prototype selection, prototype generation or prototype replacement. The nearest neighbour classifier is used with the selected reference set. Such a set can be interpreted as a data-driven categorisation model. We juxtapose the models from the two fields to enable cross-referencing. We believe that both machine learning and cognitive psychology can ...
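As a rough illustration of the machine-learning side of this parallel, the sketch below runs 1-NN on a reduced reference set. The per-class random subsampling merely stands in for the instance/prototype selection methods surveyed in the paper; the dataset and the five-prototypes-per-class budget are arbitrary choices, not the paper's.

```python
# Minimal sketch: 1-NN on a reduced reference set (illustrative only).
# The "selection" here is plain random subsampling per class; the paper
# surveys far more principled instance/prototype selection methods.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
ref_idx = []
for c in np.unique(y_train):
    idx_c = np.flatnonzero(y_train == c)
    ref_idx.extend(rng.choice(idx_c, size=5, replace=False))  # 5 prototypes per class

nn = KNeighborsClassifier(n_neighbors=1).fit(X_train[ref_idx], y_train[ref_idx])
print("1-NN accuracy with reduced reference set:", nn.score(X_test, y_test))
```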
arXiv: Image and Video Processing, 2020
Segmentation of the liver from 3D computer tomography (CT) images is one of the most frequently performed operations in medical image analysis. In the past decade, Deep Learning Models (DMs) have offered significant improvements over previous methods for liver segmentation. The success of DMs is usually owed to the user's expertise in deep learning as well as to intricate training procedures. The need for bespoke expertise limits the reproducibility of empirical studies involving DMs. Today's consensus is that an ensemble of DMs works better than the individual component DMs. In this study we set off to explore the potential of ensembles of publicly available, 'vanilla-style' DM segmenters. Our ensembles were created from four off-the-shelf DMs: U-Net, Deepmedic, V-Net, and Dense V-Networks. To prevent further overfitting and to keep the overall model simple, we use basic non-trainable ensemble combiners: majority vote, average, product and min/max. Our results with two p...
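A minimal sketch of the non-trainable combiners named above, assuming each segmenter outputs a per-voxel foreground probability map of the same shape; the random maps and the product-rule threshold are placeholders rather than outputs of the actual DMs.

```python
# Sketch of non-trainable combiners for binary segmentation, assuming each
# model yields a per-voxel foreground probability map of identical shape.
# The four random maps below are hypothetical stand-ins for the DM outputs.
import numpy as np

probs = np.stack([np.random.rand(64, 64, 64) for _ in range(4)])  # 4 hypothetical DMs

avg_mask  = probs.mean(axis=0) >= 0.5            # average combiner
prod_mask = probs.prod(axis=0) >= 0.5 ** 4       # product combiner (threshold is a choice)
min_mask  = probs.min(axis=0) >= 0.5             # min combiner
max_mask  = probs.max(axis=0) >= 0.5             # max combiner
maj_mask  = (probs >= 0.5).sum(axis=0) > len(probs) / 2  # majority vote on hard masks
```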
Expert Systems with Applications, 2020
Over the past few decades, the remarkable prediction capabilities of ensemble methods have been used within a wide range of applications. Maximization of base-model accuracy and diversity are the keys to the heightened performance of these methods. One way to achieve diversity for training the base models is to generate artificial/synthetic instances and incorporate them with the original instances. Recently, the mixup method was proposed for improving the classification power of deep neural networks (Zhang et al., 2017). The mixup method generates artificial instances by combining pairs of instances and their labels; these new instances are used for training the neural networks, promoting their regularization. In this paper, new regression tree ensembles trained with mixup, which we will refer to as Mixup Regression Forest, are presented and tested. The experimental study with 61 datasets showed that the mixup approach improved the results of both Random Forest and Rotation Forest.
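For readers unfamiliar with mixup, the sketch below follows the general recipe of Zhang et al. (2017): convex combinations of random instance pairs and of their targets. The function name, the Beta parameter and the number of generated instances are illustrative assumptions, not the settings used in the paper.

```python
# Hedged sketch of mixup-style augmentation for regression data:
# convex combinations of random instance pairs and their targets.
import numpy as np

def mixup_regression(X, y, n_new, alpha=0.2, rng=None):
    """Return (X, y) augmented with n_new mixed-up instances (illustrative)."""
    rng = np.random.default_rng(rng)
    i = rng.integers(0, len(X), size=n_new)
    j = rng.integers(0, len(X), size=n_new)
    lam = rng.beta(alpha, alpha, size=n_new)           # mixing coefficients
    X_new = lam[:, None] * X[i] + (1 - lam[:, None]) * X[j]
    y_new = lam * y[i] + (1 - lam) * y[j]
    return np.vstack([X, X_new]), np.concatenate([y, y_new])
```

The augmented sample returned by such a function could then be passed to a standard Random Forest or Rotation Forest training routine.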
Knowledge-Based Systems, 2019
The Random Balance strategy (RandBal) has recently been proposed for constructing classifier ensembles for imbalanced, two-class data sets. In RandBal, each base classifier is trained with a sample of the data with a random class prevalence, independent of the a priori distribution. Hence, for each sample, one of the classes will be undersampled while the other will be oversampled. RandBal can be applied on its own or can be combined with any other ensemble method. One particularly successful variant is RandBalBoost, which integrates Random Balance and boosting. Encouraged by the success of RandBal, this work proposes two approaches which extend RandBal to multiclass imbalance problems. Multiclass imbalance implies that at least two classes have substantially different proportions of instances. In the first approach proposed here, termed Multiple Random Balance (MultiRandBal), we deal with all classes simultaneously. The training data for each base classifier are sampled with random class proportions. The second approach we propose decomposes the multiclass problem into two-class problems using one-vs-one or one-vs-all, and builds an ensemble of RandBal ensembles. We call the two versions of the second approach OVO-RandBal and OVA-RandBal, respectively. These two approaches were chosen because they are the most straightforward extensions of RandBal for multiple classes. Our main objective is to evaluate both approaches for multiclass imbalanced problems. To this end, an experiment was carried out with 52 multiclass data sets. The results suggest that both MultiRandBal and OVO/OVA-RandBal are viable extensions of the original two-class RandBal. Collectively, they consistently outperform acclaimed state-of-the-art methods for multiclass imbalanced problems.
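A hedged sketch of the core sampling idea, drawing a random class prevalence and resampling each class to match it; the Dirichlet draw and sampling with replacement are convenient stand-ins, not the exact RandBal/MultiRandBal procedure.

```python
# Illustrative sketch of drawing a training sample with random class
# proportions, the core idea behind Random Balance; the resampling here is
# simple sampling with replacement and is not the paper's exact procedure.
import numpy as np

def random_balance_sample(X, y, rng=None):
    rng = np.random.default_rng(rng)
    classes = np.unique(y)
    n = len(y)
    props = rng.dirichlet(np.ones(len(classes)))        # random class prevalence
    counts = np.maximum(1, (props * n).astype(int))     # at least one instance per class
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=k, replace=True)
        for c, k in zip(classes, counts)
    ])
    return X[idx], y[idx]
```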
Pattern Recognition Letters, 2018
In the Restricted Set Classification approach (RSC), a set of instances must be labelled simultaneously into a given number of classes, while observing an upper limit on the number of instances from each class. In this study we expand RSC by incorporating prior probabilities for the classes and demonstrate the improvement in classification accuracy gained by doing so. As a case study, we chose the challenging task of recognising the pieces on a chessboard from top-view images, without any previous knowledge of the game. This task fits elegantly into the RSC approach, as the number of pieces on the board is limited, and each class (type of piece) may have only a fixed number of instances. We prepared an image dataset by sampling from existing competition games, arranging the pieces on the chessboard, and taking top-view snapshots. Using the grey-level intensities of each square as features, we applied single and ensemble classifiers within the RSC approach. Our results demonstrate that including prior probabilities calculated from existing chess games improves the RSC classification accuracy, which, in its own right, is better than the accuracy of the classifier applied independently.
Pattern Recognition, 2018
High-dimensional data with very few instances are typical in many application domains. Selecting a highly discriminative subset of the original features is often the main interest of the end user. The widely used feature selection protocol for this type of data consists of two steps. First, features are selected from the data (possibly through cross-validation), and, second, a cross-validation protocol is applied to test a classifier using the selected features. The selected feature set and the testing accuracy are then returned to the user. For lack of a better option, the same low-sample-size dataset is used in both steps. Questioning the validity of this protocol, we carried out an experiment using 24 high-dimensional datasets, three feature selection methods and five classifier models. We found that the accuracy returned by the above protocol is heavily biased, and we therefore propose an alternative protocol which avoids the contamination by including both steps in a single cross-validation loop. Statistical tests verify that the classification accuracy returned by the proper protocol is significantly closer to the true accuracy (estimated from an independent testing set) than that returned by the currently favoured protocol.
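A minimal sketch of the single-loop idea, assuming scikit-learn: placing feature selection inside the cross-validation pipeline ensures that the test fold of each split never influences which features are chosen. The synthetic dataset, the univariate selector and the linear SVM below are placeholders for the methods compared in the paper.

```python
# Sketch of a 'single loop' protocol: feature selection is refitted inside
# every cross-validation fold, so test folds cannot leak into the selection.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=60, n_features=2000, n_informative=10,
                           random_state=0)              # few instances, many features

pipe = make_pipeline(SelectKBest(f_classif, k=20), LinearSVC())
scores = cross_val_score(pipe, X, y, cv=5)               # selection repeated per fold
print("cross-validated accuracy (selection inside the loop):", scores.mean())
```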
Neurocomputing, 2018
Large numbers of data streams are generated today in many fields. A key challenge when learning from such streams is the problem of concept drift. Many methods, including many prototype methods, have been proposed in recent years to address this problem. This paper presents a refined taxonomy of instance selection and generation methods for the classification of data streams subject to concept drift. The taxonomy allows discrimination among a large number of methods which pre-existing taxonomies for offline instance selection methods did not distinguish. This makes possible a valuable new perspective on experimental results, and provides a framework for discussing the concepts behind different algorithm-design approaches. We review a selection of modern algorithms for the purpose of illustrating the distinctions made by the taxonomy. We present the results of a numerical experiment which examined the performance of a number of representative methods on both synthetic and real-world data sets, with and without concept drift, and discuss the implications for the directions of future research in light of the taxonomy. On the basis of the experimental results, we are able to give recommendations for the experimental evaluation of algorithms which may be proposed in the future.
Progress in Artificial Intelligence, 2019
A natural way of handling imbalanced data is to attempt to equalise the class frequencies and train the classifier of choice on balanced data. For two-class imbalanced problems, the classification success is typically measured by the geometric mean (GM) of the true positive and true negative rates. Here we prove that GM can be improved upon by instance selection, and give the theoretical conditions for such an improvement. We demonstrate that GM is non-monotonic with respect to the number of retained instances, which discourages systematic instance selection. We also show that balancing the distribution frequencies is inferior to a direct maximisation of GM. To verify our theoretical findings, we carried out an experimental study of 12 instance selection methods for imbalanced data, using 66 standard benchmark data sets. The results reveal possible room for new instance selection methods for imbalanced data.
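For reference, the geometric mean criterion mentioned above is commonly defined as

```latex
\mathrm{GM} \;=\; \sqrt{\mathrm{TPR}\cdot\mathrm{TNR}},
\qquad
\mathrm{TPR}=\frac{TP}{TP+FN},\quad
\mathrm{TNR}=\frac{TN}{TN+FP},
```

where TP, FN, TN and FP are the usual confusion-matrix counts.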
Information Fusion, 2019
Detecting change in multivariate data is a challenging problem, especially when class labels are not available. There is a large body of research on univariate change detection, notably in control charts developed originally for engineering applications. We evaluate univariate change detection approaches, including those in the MOA framework, built into ensembles where each member observes one feature of the input space of an unsupervised change detection problem. We present a comparison between the ensemble combinations and three established 'pure' multivariate approaches over 96 data sets, and a case study on the KDD Cup 1999 network intrusion detection dataset. We found that ensemble combinations of univariate methods consistently outperformed the multivariate methods on the four experimental metrics.
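A toy sketch of the ensemble idea, with one univariate detector per feature and a change flagged when a chosen fraction of members signal; the mean-shift detector and the voting threshold below are simple stand-ins, not the MOA detectors or combination rules evaluated in the paper.

```python
# Toy ensemble of univariate change detectors: one member per feature,
# change is declared when enough members vote. Illustrative only.
import numpy as np

class MeanShiftDetector:
    """Signals when the mean of a new window drifts from a reference window."""
    def __init__(self, reference, threshold=3.0):
        self.mu, self.sigma = reference.mean(), reference.std() + 1e-12
        self.threshold = threshold
    def update(self, window):
        return abs(window.mean() - self.mu) / self.sigma > self.threshold

def ensemble_change(reference, window, vote_fraction=0.5):
    detectors = [MeanShiftDetector(reference[:, f]) for f in range(reference.shape[1])]
    votes = [d.update(window[:, f]) for f, d in enumerate(detectors)]
    return np.mean(votes) >= vote_fraction               # simple vote-based fusion

rng = np.random.default_rng(1)
ref = rng.normal(size=(500, 8))                           # stationary reference block
new = rng.normal(loc=0.8, size=(100, 8))                  # drifted stream window
print(ensemble_change(ref, new))
```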
Pattern Recognition, 2017
We consider a problem where a set X of N objects (instances) coming from c classes has to be classified simultaneously. A restriction is imposed on X in that the maximum possible number of objects from each class is known; hence we dubbed the problem 'who-is-there?'. We compare three approaches to this problem: (1) independent classification, whereby each object is labelled with the class with the largest posterior probability; (2) a greedy approach which enforces the restriction; and (3) a theoretical approach which, in addition, maximises the likelihood of the label assignment, implemented through the Hungarian assignment algorithm. Our experimental study consists of two parts. The first part includes a custom-made chess data set where the pieces on the chess board must be recognised together from an image of the board. In the second part, we simulate the restricted set classification scenario using 96 datasets from a recently collated repository (University of Santiago de Compostela, USC). Our results show that the proposed approach (3) outperforms approaches (1) and (2).
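Approach (3) can be sketched with an off-the-shelf Hungarian solver: each class is expanded into as many 'slots' as its capacity allows, and the assignment maximising the total log-posterior is returned. The posterior matrix and capacities below are toy placeholders, not data from the paper.

```python
# Sketch of a capacity-constrained label assignment via the Hungarian
# algorithm, maximising the joint log-posterior. Illustrative values only.
import numpy as np
from scipy.optimize import linear_sum_assignment

def restricted_set_labels(posteriors, capacities):
    """posteriors: (N, c) matrix; capacities: per-class upper limits (sum >= N)."""
    slots = np.concatenate([[k] * cap for k, cap in enumerate(capacities)])
    log_p = np.log(posteriors[:, slots] + 1e-12)          # N x (total slots)
    rows, cols = linear_sum_assignment(-log_p)            # maximise total log-posterior
    return slots[cols[np.argsort(rows)]]

posteriors = np.array([[0.7, 0.2, 0.1],
                       [0.6, 0.3, 0.1],
                       [0.5, 0.4, 0.1]])
print(restricted_set_labels(posteriors, capacities=[1, 2, 2]))
```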
2016 International Joint Conference on Neural Networks (IJCNN), 2016
This study brings together systematised views of two related areas: data editing for the nearest neighbour classifier and adaptive learning in the presence of concept drift. The growing number of studies in the intersection of these areas warrants a closer look. We revise and update the taxonomies of the two areas proposed in the literature and argue that they are not sufficiently discriminative with respect to methods for prototype selection and prototype generation in the presence of concept drift. We proceed to create a bespoke taxonomy of these methods and illustrate it with ten examples from the literature. The new taxonomy can serve as a road map for researching the intersection area and inform the development of new methods.
Pattern Recognition, 2001
Classifier combination is now an established pattern recognition subdiscipline. Despite the strong aspiration for theoretical studies, classifier combination relies mainly on heuristic and empirical solutions. Assuming that "soft computing" encompasses neural networks, evolutionary computation, and fuzzy sets, we explain how each of the three components has been used in classifier combination.
Lecture Notes in Computer Science
Cluster ensembles are deemed to be better than single clustering algorithms for discovering complex or noisy structures in data. Various heuristics for constructing such ensembles have been examined in the literature, e.g., random feature selection, weak clusterers, random projections, etc. Typically, one heuristic is picked at a time to construct the ensemble. To increase the diversity of the ensemble, several heuristics may be applied together. However, not every combination may be beneficial. Here we apply a standard genetic algorithm (GA) to select from 7 standard heuristics for k-means cluster ensembles. The ensemble size is also encoded in the chromosome. In this way the data is forced to guide the selection of heuristics as well as the ensemble size. Eighteen moderate-size datasets were used: 4 artificial and 14 real. The results resonate with our previous findings in that high diversity is not necessarily a prerequisite for high accuracy of the ensemble. No particular combination of heuristics appeared to be consistently chosen across all datasets, which justifies the existing variety of cluster ensembles. Among the most often selected heuristics were random feature extraction, random feature selection and a random number of clusters assigned for each ensemble member. Based on the experiments, we recommend that the current practice of using one or two heuristics for building k-means cluster ensembles be revised in favour of using 3-5 heuristics.
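Purely as an illustration of what such a chromosome might encode, the snippet below pairs one bit per candidate heuristic with an ensemble-size gene; the list of heuristic names is hypothetical (only some of the seven are named in the abstract), and the GA operators and fitness function are not shown.

```python
# Illustrative chromosome encoding only: heuristic on/off bits plus an
# ensemble-size gene. The heuristic names are hypothetical placeholders.
import numpy as np

HEURISTICS = ["random_feature_selection", "random_feature_extraction",
              "random_projections", "weak_clusterers", "random_k",
              "data_subsampling", "random_initialisation"]

rng = np.random.default_rng(0)
heuristic_bits = rng.integers(0, 2, size=len(HEURISTICS))   # 1 = heuristic is used
ensemble_size = int(rng.integers(5, 51))                    # size gene
chromosome = np.append(heuristic_bits, ensemble_size)       # one GA candidate solution
print(dict(zip(HEURISTICS, heuristic_bits)), "ensemble size:", ensemble_size)
```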
Lecture Notes in Computer Science, 2010
Although diversity in classifier ensembles is desirable, its relationship with the ensemble accuracy is not straightforward. Here we derive a decomposition of the majority vote error into three terms: average individual accuracy, "good" diversity and "bad" diversity. The good diversity term is subtracted from the individual error, whereas the bad diversity term is added to it. We relate the two diversity terms to the majority vote limits defined previously (the patterns of success and failure). A simulation study demonstrates how the proposed decomposition can be used to gain insights about majority vote classifier ensembles.
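In one common notation (not necessarily the paper's), a decomposition with the structure described above can be written as

```latex
e_{\mathrm{maj}}
\;=\; \bar{e}
\;-\; \underbrace{\frac{1}{NL}\sum_{\mathbf{x}:\ \mathrm{MV\ correct}} \lambda(\mathbf{x})}_{\text{good diversity}}
\;+\; \underbrace{\frac{1}{NL}\sum_{\mathbf{x}:\ \mathrm{MV\ wrong}} \mu(\mathbf{x})}_{\text{bad diversity}},
```

where N is the number of objects, L the number of classifiers, \(\bar{e}\) the average individual error, \(\lambda(\mathbf{x})\) the number of ensemble members that misclassify an object labelled correctly by the majority vote, and \(\mu(\mathbf{x})\) the number of members that classify correctly an object the majority vote gets wrong. The precise definitions of the two diversity terms are those derived in the paper; this is only a schematic rendering.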
GSTF Journal on Computing (JoC), 2014
Affective gaming (AG) is a cross-disciplinary area drawing upon psychology, physiology, electronic engineering and computer science, among others. This paper presents a historical overview of affective gaming, bringing together psychophysiological system developments, a timeline of video game graphical advancements and industry trends, thereby offering an entry point into affective gaming research. It is proposed that video games may soon reach a peak in perceivable graphical improvements. This opens the door for innovative game enhancement strategies, such as emotion-driven interactions between the player and the gaming environment.
Proceedings of 13th International Conference on Pattern Recognition, 1996
In this paper we consider a radial basis function (RBF) network with tunable shape and spread parameters of the activation function. We argue that fewer hidden nodes with different RBF shapes can better match the classification regions while still preserving the context of the probabilistic semiparametric approximation of the conditional probability density functions (pdf). Instead of squared Euclidean norm (L2 norm)
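The abstract is truncated here, so the exact activation is not given. Purely for illustration, one common way to equip an RBF with both a spread and a shape parameter is

```latex
\phi_j(\mathbf{x}) \;=\; \exp\!\left(-\left(\frac{\lVert \mathbf{x}-\mathbf{c}_j\rVert}{\sigma_j}\right)^{p_j}\right),
```

with centre \(\mathbf{c}_j\), spread \(\sigma_j\) and shape exponent \(p_j\) for the j-th hidden node; this is an assumption for illustration and not necessarily the activation proposed in the paper.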
Cluster ensembles are deemed to be better than single clustering algorithms for discovering complex or noisy structures in data. We consider different heuristics to introduce diversity in cluster ensembles and study their individual and combined effect on the ensemble accuracy. Our experiments with three artificial and three real data sets, and 12 ensemble types, showed that the most successful diversifying
2010 IEEE International Conference on Information Theory and Information Security, 2010
Design patterns are general, reusable solutions to commonly occurring problems in software design. One of the important patterns is the Strategy pattern. The Strategy pattern is intended to provide a means to define a family of algorithms, encapsulate each one as an object, and make them interchangeable. In this paper, a solution is proposed to improve security in application programs by using the Strategy pattern. In this solution, one can choose the algorithms dynamically based on security requirements and computing time constraints.
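A generic sketch of the Strategy pattern in this spirit: a family of interchangeable algorithms behind a common interface, selected at run time from, say, security requirements or a time budget. The toy 'ciphers' below are placeholders, not the security algorithms discussed in the paper.

```python
# Generic Strategy pattern sketch: interchangeable algorithms behind one
# interface, chosen dynamically by the context. Toy ciphers for illustration.
from typing import Protocol

class EncryptionStrategy(Protocol):
    def encrypt(self, data: bytes) -> bytes: ...

class FastCipher:
    """Cheap but weak transformation (toy stand-in)."""
    def encrypt(self, data: bytes) -> bytes:
        return bytes(b ^ 0x5A for b in data)

class StrongCipher:
    """Slower, 'stronger' transformation (toy stand-in)."""
    def encrypt(self, data: bytes) -> bytes:
        return bytes(reversed([b ^ 0xA5 for b in data]))

class SecureChannel:
    """Context class: delegates to whichever strategy it was configured with."""
    def __init__(self, strategy: EncryptionStrategy):
        self.strategy = strategy
    def send(self, data: bytes) -> bytes:
        return self.strategy.encrypt(data)

high_security = True                                   # e.g. derived from requirements
channel = SecureChannel(StrongCipher() if high_security else FastCipher())
print(channel.send(b"hello"))
```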