Papers by Miquel Perelló Nieto
arXiv (Cornell University), Feb 6, 2023
In supervised learning, low quality annotations lead to poorly performing classification and detection models, while also rendering evaluation unreliable. This is particularly apparent on temporal data, where annotation quality is affected by multiple factors. For example, in the post-hoc self-reporting of daily activities, cognitive biases are one of the most common ingredients. In particular, reporting the start and duration of an activity after its finalisation may incorporate biases introduced by personal time perceptions, as well as the imprecision and lack of granularity due to time rounding. Here we propose a method to model human biases on temporal annotations and argue for the use of soft labels. Experimental results in synthetic data show that soft labels provide a better approximation of the ground truth for several metrics. We showcase the method on a real dataset of daily activities.
In this review we explore the uncertainty of common supervised classification models. We explain some of their prior assumptions, how these assumptions bias their predicted probabilities, and how to interpret their confidence values in different situations. We also describe our proposed method to make common classifiers more reliable and versatile, and how these can be used in fluctuating scenarios in which unexpected classes and anomalies may appear during deployment. Furthermore, we show an extension of proper loss functions that allows classifiers that minimize an empirical loss to be trained with weak labels (labels that may be wrong). Finally, we discuss two future directions of our current work: (1) how to get better probability estimates in Deep Neural Networks, and (2) new methods to reuse old datasets whose labels may be outdated and weak. Keywords: Supervised learning, Semi-supervised learning, classifier calibration, classification with confidence, cautious classificati...
2023 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops)
Machine Learning
This paper provides both an introduction to and a detailed overview of the principles and practice of classifier calibration. A well-calibrated classifier correctly quantifies the level of uncertainty or confidence associated with its instance-wise predictions. This is essential for critical applications, optimal decision making, cost-sensitive classification, and for some types of context change. Calibration research has a rich history which predates the birth of machine learning as an academic field by decades. However, a recent increase in the interest on calibration has led to new methods and the extension from binary to the multiclass setting. The space of options and issues to consider is large, and navigating it requires the right set of concepts and tools. We provide both introductory material and up-to-date technical details of the main concepts and methods, including proper scoring rules and other evaluation metrics, visualisation approaches, a comprehensive account of pos...
BACKGROUND The current assessment of recovery after total hip or knee replacement is largely based on the measurement of health outcomes through self-report and clinical observations at follow-up appointments in clinical settings. Home activity-based monitoring may improve assessment of recovery by enabling the collection of more holistic information on a continuous basis. OBJECTIVE This study aimed to introduce orthopedic surgeons to time-series analyses of patient activity data generated from a platform of sensors deployed in the homes of patients who have undergone primary total hip or knee replacement and understand the potential role of these data in postoperative clinical decision-making. METHODS Orthopedic surgeons and registrars were recruited through a combination of convenience and snowball sampling. Inclusion criteria were a minimum required experience in total joint replacement surgery specific to the hip or knee or familiarity with postoperative recovery assessment. Exc...
Scientific Data
SPHERE is a large multidisciplinary project to research and develop a sensor network to facilitate home healthcare by activity monitoring, specifically towards activities of daily living. It aims to use the latest technologies in low powered sensors, internet of things, machine learning and automated decision making to provide benefits to patients and clinicians. This dataset comprises data collected from a SPHERE sensor network deployment during a set of experiments conducted in the ‘SPHERE House’ in Bristol, UK, during 2016, including video tracking, accelerometer and environmental sensor data obtained by volunteers undertaking both scripted and non-scripted activities of daily living in a domestic residence. Trained annotators provided ground-truth labels annotating posture, ambulation, activity and location. This dataset is a valuable resource both within and outside the machine learning community, particularly in developing and evaluating algorithms for identifying activities o...
Third International Conference on Computer Science and Communication Technology (ICCSCT 2022)
A smart home equipped with a diversity of multimodal sensors is a meaningful setting for acquiring the health status of its residents and improving their well-being. In recent years, sensor-based activity recognition has received growing research attention. However, the multi-modal nature of these sensor platforms raises great challenges with respect to the data fusion of the different sensor sources. To solve this problem, we present an activity recognition approach incorporating an attention mechanism. A Convolutional Neural Network-based training framework is developed to extract representative features for activities. Specifically, we design two attention modules, channel-wise and temporal-wise, to capture the interdependencies between the channel and temporal dimensions of its convolutional features. We evaluate the attention-based approach on a real activity recognition challenge dataset. Experiments show that the attention network-based feature fusion can effectively improve the activity recognition performance.
arXiv (Cornell University), Aug 7, 2019
This paper describes HyperStream, a large-scale, flexible and robust software package, written in the Python language, for processing streaming data with workflow creation capabilities. HyperStream overcomes the limitations of other computational engines and provides high-level interfaces to execute complex nesting, fusion, and prediction both in online and offline forms in streaming environments. HyperStream is a general purpose tool that is well-suited for the design, development, and deployment of Machine Learning algorithms and predictive models in a wide space of sequential predictive problems. Source code, installation instructions, examples, and documentation can be found at: https://github.com/IRC-SPHERE/HyperStream.
arXiv (Cornell University), Dec 19, 2021
This paper provides both an introduction to and a detailed overview of the principles and practice of classifier calibration. A well-calibrated classifier correctly quantifies the level of uncertainty or confidence associated with its instance-wise predictions. This is essential for critical applications, optimal decision making, cost-sensitive classification, and for some types of context change. Calibration research has a rich history which predates the birth of machine learning as an academic field by decades. However, a recent increase in the interest on calibration has led to new methods and the extension from binary to the multiclass setting. The space of options and issues to consider is large, and navigating it requires the right set of concepts and tools. We provide both introductory material and up-to-date technical details of the main...
2021 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops)
This paper looks to explore the challenges faced when producing a set of annotations from videos produced by a pilot study evaluating 24 participants (12 with Parkinson's disease, each accompanied by a healthy volunteer control participant) who are free-living in a house embedded with a platform of sensors. We discuss the outcome measures chosen to annotate from the videos and the controlled vocabularies formulated for this task, the tools and processes, how we intend to achieve standardisation and normalisation of the annotations, and how to improve quality and re-usability of the annotation dataset.
2022 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops)
Many ways of annotating a dataset for machine learning classification tasks that go beyond the usual class labels exist in practice. These are of interest as they can simplify or facilitate the collection of annotations, while not greatly affecting the resulting machine learning model. Many of these fall under the umbrella term of weak labels or annotations. However, it is not always clear how different alternatives are related. In this paper we propose a framework for categorising weak supervision settings with the aim of: (1) helping the dataset owner or annotator navigate through the available options within weak supervision when prescribing an annotation process, and (2) describing existing annotations for a dataset to machine learning practitioners so that we allow them to understand the implications for the learning process. To this end, we identify the key elements that characterise weak supervision and devise a series of dimensions that categorise most of the existing approaches. We show how common settings in the literature fit within the framework and discuss its possible uses in practice.
Journal of Ambient Intelligence and Humanized Computing, 2022
Human activity recognition (HAR), which aims at inferring the behavioral patterns of people, is a fundamental research problem in digital health and ambient intelligence. The application of machine learning methods in HAR has been investigated vigorously in recent years. However, there are still a number of challenges confronting the task, where one significant barrier lies in the longstanding shortage of annotations. To address this issue, we establish a new paradigm for HAR, which integrates active learning and semi-supervised learning into one framework. The main idea is to reduce the annotation cost by actively selecting the most informative samples for annotation, as well as leveraging the unlabelled instances in a semi-supervised way. In particular, we propose to utilize the massive unlabelled data via temporal ensembling of convolutional neural networks (CNN), which yields robust consensus predictions by aggregating the outputs of the training networks on different epochs. We...
This is the first pre-release of PyCalib, a library to calibrate probabilistic classifiers. The current contributors of the code are @perellonieto @Srceh @tmfilho @markus93
Advances in Intelligent Data Analysis XVI, 2017
In many real-world problems, labels are often weak, meaning that each instance is labelled as belonging to one of several candidate categories, at most one of them being true. Recent theoretical contributions have shown that it is possible to construct proper losses or classification calibrated losses for weakly labelled classification scenarios by means of a linear transformation of conventional proper or classification calibrated losses, respectively. However, how to translate these theoretical results into practice has not been explored yet. This paper discusses both the algorithmic design and the potential advantages of this approach, analyzing consistency and convexity issues arising in practical settings, and evaluating the behavior of such transformations under different types of weak labels.
2020 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), 2020
In hardware deployments, it is often necessary to test platforms for suitability for particular purposes. As lengthy data collection processes often outlive specific iterations of hardware and firmware, it is likely that migration between platforms may become necessary. In this short paper we describe a practical approach employed for acceptance testing, comparison and validation of two iterations of a wearable accelerometer and localisation platform, based on an annotated 15-task activity script. We present an analysis on the data generated for the different activities, and compare device performance using common machine learning algorithms for activity recognition.
The UK health service sees around 160,000 total hip or knee replacements every year and this number is expected to rise. Expectations of surgical outcome are changing alongside demographic trends, whilst aftercare may be fractured as a result of resource limitation or other factors. Conventional assessments of health outcomes must evolve to keep up with these changing trends. In practice, patients may visit a health care professional to discuss recovery and will provide survey feedback to clinicians using standardised instruments, such as the Oxford Hip & Knee score, in the months following surgery. To aid clinicians in providing accurate assessment of patient recovery a continuous home health care monitoring system would be beneficial. In this paper the authors explore how the SPHERE sensor network can be used to automatically generate measures of recovery from arthroplasty to facilitate continuous monitoring of behaviour, including location, room transitions, movement and activity...
A dataset containing data from the Wearable 3 sensor developed by SPHERE. This dataset contains two sessions in which participants complete a simple set of activities of daily living, wearing multiple wearables (1-2 wearables on each wrist). It uses a methodology described in a recent paper, "Towards a Methodology for Acceptance Testing and Validation of Monitoring Bodyworn Devices".
This thesis demonstrates the possibility of using multi-modal learning techniques in convolutional neural networks to classify images. Specifically, we show that it is possible to learn separate filters for luminance and chrominance without losing quality in the predictions.