Thomas Demeester

Ghent University, Information Technology (INTEC), Post-Doc

Papers by Thomas Demeester

Overly optimistic prediction results on imbalanced data: a case study of flaws and benefits when applying over-sampling

Artificial Intelligence in Medicine, 2021

Information extracted from electrohysterography recordings could potentially prove to be an interesting additional source of information to estimate the risk on preterm birth. Recently, a large number of studies have reported near-perfect results to distinguish between recordings of patients that will deliver term or preterm using a public resource, called the Term/Preterm Electrohysterogram database. However, we argue that these results are overly optimistic due to a methodological flaw being made. In this work, we focus on one specific type of methodological flaw: applying over-sampling before partitioning the data into mutually exclusive training and testing sets. We show how this causes the results to be biased using two artificial datasets and reproduce results of studies in which this flaw was identified. Moreover, we evaluate the actual impact of over-sampling on predictive performance, when applied prior to data partitioning, using the same methodologies of related studies, to provide a realistic view of these methodologies' generalization capabilities. We make our research reproducible by providing all the code under an open license.
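The flaw named in this abstract (over-sampling before splitting) can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the data are synthetic noise, the over-sampler is plain random duplication of minority samples, the classifier is 1-nearest-neighbour, and the helper names (`oversample`, `split_half`, `minority_recall`) are hypothetical. Because the features carry no signal, any high minority recall can only come from duplicated samples leaking from the training set into the test set.

```python
import random


def nearest_label(train, x):
    # 1-nearest-neighbour prediction using squared Euclidean distance.
    best = min(train, key=lambda p: (p[0][0] - x[0]) ** 2 + (p[0][1] - x[1]) ** 2)
    return best[1]


def minority_recall(train, test):
    # Fraction of minority-class test samples predicted as minority.
    minority = [x for x, y in test if y == 1]
    hits = sum(nearest_label(train, x) == 1 for x in minority)
    return hits / len(minority)


def oversample(data, rng):
    # Assumed over-sampler: duplicate random minority samples until balanced.
    minority = [d for d in data if d[1] == 1]
    majority = [d for d in data if d[1] == 0]
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return data + extra


def split_half(data, rng):
    # Shuffle a copy and cut it into two disjoint halves.
    data = data[:]
    rng.shuffle(data)
    half = len(data) // 2
    return data[:half], data[half:]


rng = random.Random(0)
# 20 minority and 180 majority samples; features are pure noise.
data = [((rng.random(), rng.random()), 1 if i < 20 else 0) for i in range(200)]

# Flawed order: over-sample first, then split. Copies of the same
# minority sample end up on both sides of the split.
leaky_train, leaky_test = split_half(oversample(data, rng), rng)
leaky = minority_recall(leaky_train, leaky_test)

# Sound order: split first, then over-sample the training half only.
train, test = split_half(data, rng)
correct = minority_recall(oversample(train, rng), test)

print(f"over-sample then split: minority recall = {leaky:.2f}")
print(f"split then over-sample: minority recall = {correct:.2f}")
```

In this sketch the first number should be far higher than the second even though there is nothing to learn, which is the kind of optimistic bias the abstract describes.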

Exploration of block-wise dynamic sparseness

Pattern Recognition Letters

A Million Tweets Are Worth a Few Points: Tuning Transformers for Customer Service Tasks

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

The B2 Level and the Dream of a Common Standard

Language Assessment Quarterly

System Identification with Time-Aware Neural Sequence Models

Proceedings of the AAAI Conference on Artificial Intelligence

Established recurrent neural networks are well-suited to solve a wide variety of prediction tasks involving discrete sequences. However, they do not perform as well in the task of dynamical system identification, when dealing with observations from continuous variables that are unevenly sampled in time, for example due to missing observations. We show how such neural sequence models can be adapted to deal with variable step sizes in a natural way. In particular, we introduce a ‘time-aware’ and stationary extension of existing models (including the Gated Recurrent Unit) that allows them to deal with unevenly sampled system observations by adapting to the observation times, while facilitating higher-order temporal behavior. We discuss the properties and demonstrate the validity of the proposed approach, based on samples from two industrial input/output processes.

Predicting Psychological Health from Childhood Essays. The UGent-IDLab CLPsych 2018 Shared Task System

Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic

This paper describes the IDLab system submitted to Task A of the CLPsych 2018 shared task. The goal of this task is predicting psychological health of children based on language used in handwritten essays and sociodemographic control variables. Our entry uses word- and character-based features as well as lexicon-based features and features derived from the essays such as the quality of the language. We apply linear models, gradient boosting as well as neural-network based regressors (feed-forward, CNNs and RNNs) to predict scores. We then make ensembles of our best performing models using a weighted average.

A Self-Training Approach for Short Text Clustering

Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)

Short text clustering is a challenging problem when adopting traditional bag-of-words or TF-IDF representations, since these lead to sparse vector representations for short texts. Low-dimensional continuous representations or embeddings can counter that sparseness problem: their high representational power is exploited in deep clustering algorithms. While deep clustering has been studied extensively in computer vision, relatively little work has focused on NLP. The method we propose learns discriminative features from both an autoencoder and a sentence embedding, then uses assignments from a clustering algorithm as supervision to update weights of the encoder network. Experiments on three short text datasets empirically validate the effectiveness of our method.

Sub-event detection from twitter streams as a sequence labeling problem

Proceedings of the 2019 Conference of the North

This paper introduces improved methods for sub-event detection in social media streams, by applying neural sequence models not only on the level of individual posts, but also directly on the stream level. Current approaches to identify sub-events within a given event, such as a goal during a soccer match, essentially do not exploit the sequential nature of social media streams. We address this shortcoming by framing the sub-event detection problem in social media streams as a sequence labeling task and adopt a neural sequence architecture that explicitly accounts for the chronological order of posts. Specifically, we (i) establish a neural baseline that outperforms a graph-based state-of-the-art method for binary sub-event detection (2.7% micro-F1 improvement), as well as (ii) demonstrate superiority of a recurrent neural network model on the posts sequence level for labeled sub-events (2.4% bin-level F1 improvement over non-sequential models).

Adversarial training for multi-context joint entity and relation extraction

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Adversarial training (AT) is a regularization method that can be used to improve the robustness of neural network methods by adding small perturbations in the training data. We show how to use AT for the tasks of entity recognition and relation extraction. In particular, we demonstrate that applying AT to a general purpose baseline model for jointly extracting entities and relations, allows improving the state-of-the-art effectiveness on several datasets in different contexts (i.e., news, biomedical, and real estate data) and for different languages (English and Dutch).

Jack the Reader – A Machine Reading Framework

Proceedings of ACL 2018, System Demonstrations

Many Machine Reading and Natural Language Understanding tasks require reading supporting text in order to answer questions. For example, in Question Answering, the supporting text can be newswire or Wikipedia articles; in Natural Language Inference, premises can be seen as the supporting text and hypotheses as questions. Providing a set of useful primitives operating in a single framework of related tasks would allow for expressive modelling, and easier model comparison and replication. To that end, we present Jack the Reader (JACK), a framework for Machine Reading that allows for quick model prototyping by component reuse, evaluation of new models on existing datasets as well as integrating new datasets and applying them on a growing set of implemented baseline models. JACK is currently supporting (but not limited to) three tasks: Question Answering, Natural Language Inference, and Link Prediction. It is developed with the aim of increasing research efficiency and code reuse.

Joint entity recognition and relation extraction as a multi-head selection problem

Expert Systems with Applications

State-of-the-art models for joint entity recognition and relation extraction strongly rely on external natural language processing (NLP) tools such as POS (part-of-speech) taggers and dependency parsers. Thus, the performance of such joint models depends on the quality of the features obtained from these NLP tools. However, these features are not always accurate for various languages and contexts. In this paper, we propose a joint neural model which performs entity recognition and relation extraction simultaneously, without the need of any manually extracted features or the use of any external tool. Specifically, we model the entity recognition task using a CRF (Conditional Random Fields) layer and the relation extraction task as a multi-head selection problem (i.e., potentially identify multiple relations for each entity). We present an extensive experimental setup, to demonstrate the effectiveness of our method using datasets from various contexts (i.e., news, biomedical, real estate) and languages (i.e., English, Dutch). Our model outperforms the previous neural models that use automatically extracted features, while it performs within a reasonable margin of feature-based neural models, or even beats them.

Reconstructing the house from the ad: Structured prediction on real estate classifieds

Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers

In this paper, we address the (to the best of our knowledge) new problem of extracting a structured description of real estate properties from their natural language descriptions in classifieds. We survey and present several models to (a) identify important entities of a property (e.g., rooms) from classifieds and (b) structure them into a tree format, with the entities as nodes and edges representing a part-of relation. Experiments show that a graph-based system deriving the tree from an initially fully connected entity graph, outperforms a transition-based system starting from only the entity nodes, since it better reconstructs the tree.

UGent Participation in the Microblog Track 2012

Modeling the Broadband Inductive and Resistive Behavior of Composite Conductors

IEEE Microwave and Wireless Components Letters, Apr 1, 2008

Accurately modeling interconnect structures is an important issue in high-frequency chip design. Conductors have a finite thickness and conductivity, and are often composed of different metals. It is shown that the Dirichlet-to-Neumann technique can be used to model the inductive and resistive behavior of such structures, up to high frequencies at which the skin effect is well-developed. Furthermore, the method can be used for the accurate and fast calculation of the longitudinal current distribution in the composite conductors.

An Automated End-To-End Pipeline for Fine-Grained Video Annotation using Deep Neural Networks

Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval - ICMR '16, 2016

Mirex and Taily at TREC 2013

Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval - SIGIR '13, 2013

Search engines can improve their efficiency by selecting only few promising shards for each query. State-of-the-art shard selection algorithms first query a central index of sampled documents, and their effectiveness is similar to searching all shards. However, the search in the central index also hurts efficiency. Additionally, we show that the effectiveness of these approaches varies substantially with the sampled documents. This paper proposes Taily, a novel shard selection algorithm that models a query's score distribution in each shard as a Gamma distribution and selects shards with highly scored documents in the tail of the distribution. Taily estimates the parameters of score distributions based on the mean and variance of the score function's features in the collections and shards. Because Taily operates on term statistics instead of document samples, it is efficient and has deterministic effectiveness. Experiments on large web collections (Gov2, CluewebA and CluewebB) show that Taily achieves similar effectiveness to sample-based approaches, and improves upon their efficiency by roughly 20% in terms of used resources and response time.
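As a rough illustration of fitting a Gamma distribution from a mean and a variance, the sketch below uses the standard method-of-moments estimates (shape = mean² / variance, scale = variance / mean). Whether this matches the exact estimator used by Taily is an assumption; the function name `gamma_from_moments` and the example numbers are hypothetical, and the tail-probability step used for ranking shards is not shown.

```python
def gamma_from_moments(mean, variance):
    """Method-of-moments Gamma fit: returns (shape, scale).

    For a Gamma distribution, mean = shape * scale and
    variance = shape * scale**2, which inverts to the formulas below.
    """
    if mean <= 0 or variance <= 0:
        raise ValueError("mean and variance must be positive")
    shape = mean * mean / variance
    scale = variance / mean
    return shape, scale


# Hypothetical per-shard score statistics for one query.
shape, scale = gamma_from_moments(mean=4.0, variance=2.0)
print(shape, scale)
```

The fitted parameters reproduce the input moments by construction, so the fit needs only aggregate statistics and no sampled documents.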

Modeling the broadband resistive and inductive behavior of polygonal conductors

2009 International Conference on Electromagnetics in Advanced Applications, 2009

This paper describes an accurate method to discretize the Dirichlet-to-Neumann boundary operator for a convex polygonal conductor. The technique is based on an expansion of the boundary value of the current density. Because the corresponding expansion functions exhibit the exact current behavior inside the conductor, they ensure a very good accuracy up to skin effect frequencies. In combination with a classical boundary integral method and the Method of Moments, the Dirichlet-to-Neumann technique allows for a direct determination of the resistive and inductive properties of transmission line configurations constructed from these conductors, as is illustrated with some numerical examples.

Quasi-TM Transmission Line Parameters of Coupled Lossy Lines Based on the Dirichlet to Neumann Boundary Operator

IEEE Transactions on Microwave Theory and Techniques, 2000

This paper presents a new multiconductor transmission line model for general 2-D lossy configurations based on mode reciprocity. Particular attention is devoted to elucidate the validity of the quasi-TM model and the approximations that have to be invoked to obtain this model. A new derivation of the complex capacitance matrix is given, especially taking into account the presence of semiconductors. This derivation automatically leads to a nonclassical circuit signal current definition and demands for a formulation of the complex inductance problem consistent with that definition. The relevant resistance, inductance, conductance, and capacitance circuit matrices are obtained by solving boundary integral equations only, making use of the Dirichlet to Neumann boundary operator for the different materials. This allows to simulate complex metal-insulator-semiconductor structures, as shown in the numerical examples.

Modeling the Broadband Inductive and Resistive Behavior of Composite Conductors

IEEE Microwave and Wireless Components Letters, 2000

Accurately modeling interconnect structures is an important issue in high-frequency chip design. Conductors have a finite thickness and conductivity, and are often composed of different metals. It is shown that the Dirichlet-to-Neumann technique can be used to model the inductive and resistive behavior of such structures, up to high frequencies at which the skin effect is well-developed. Furthermore, the method can be used for the accurate and fast calculation of the longitudinal current distribution in the composite conductors.
