Papers by Thomas Demeester
Artificial Intelligence in Medicine, 2021
Information extracted from electrohysterography recordings could prove to be a useful additional source of information for estimating the risk of preterm birth. Recently, a large number of studies have reported near-perfect results in distinguishing between recordings of patients who will deliver at term or preterm, using a public resource called the Term/Preterm Electrohysterogram database. However, we argue that these results are overly optimistic due to a methodological flaw. In this work, we focus on one specific type of methodological flaw: applying over-sampling before partitioning the data into mutually exclusive training and testing sets. We show how this biases the results using two artificial datasets, and we reproduce the results of studies in which this flaw was identified. Moreover, we evaluate the actual impact of over-sampling on predictive performance, when applied prior to data partitioning, using the same methodologies as related studies, to provide a realistic view of these methodologies' generalization capabilities. We make our research reproducible by providing all the code under an open license.
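The flaw described in this abstract can be illustrated with a minimal sketch. It assumes plain random duplication of minority-class rows as a simplified stand-in for whatever over-sampling technique a given study used. The helper names (`oversample`, `split`, `leaked`) and the toy data are hypothetical and are not taken from the paper's code.

```python
import random

def oversample(rows, rng):
    """Duplicate minority-class rows at random until both classes have equal size."""
    pos = [r for r in rows if r[1] == 1]
    neg = [r for r in rows if r[1] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:  # nothing to duplicate
        return list(rows)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(rows) + extra

def split(rows, rng, test_fraction=0.3):
    """Shuffle a copy of the rows and cut it into (train, test)."""
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def leaked(train, test):
    """Count test rows whose original sample id also occurs in the training set."""
    train_ids = {r[0] for r in train}
    return sum(1 for r in test if r[0] in train_ids)

rng = random.Random(0)
# Toy data: (sample_id, label) with 10 minority and 90 majority samples.
data = [(i, 1 if i < 10 else 0) for i in range(100)]

# Flawed order: over-sample first, then partition.
train_bad, test_bad = split(oversample(data, rng), rng)

# Sound order: partition first, then over-sample the training set only.
train_raw, test_ok = split(data, rng)
train_ok = oversample(train_raw, rng)

print("leaked test rows (over-sample, then split):", leaked(train_bad, test_bad))
print("leaked test rows (split, then over-sample):", leaked(train_ok, test_ok))
```

In the flawed order, copies of the same minority sample can land on both sides of the split, so a model is partly evaluated on data it has already seen. Partitioning first keeps the test set disjoint from the training set by construction.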
Pattern Recognition Letters
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Language Assessment Quarterly
Proceedings of the AAAI Conference on Artificial Intelligence
Established recurrent neural networks are well-suited to solve a wide variety of prediction tasks involving discrete sequences. However, they do not perform as well in the task of dynamical system identification, when dealing with observations from continuous variables that are unevenly sampled in time, for example due to missing observations. We show how such neural sequence models can be adapted to deal with variable step sizes in a natural way. In particular, we introduce a ‘time-aware’ and stationary extension of existing models (including the Gated Recurrent Unit) that allows them to deal with unevenly sampled system observations by adapting to the observation times, while facilitating higher-order temporal behavior. We discuss the properties and demonstrate the validity of the proposed approach, based on samples from two industrial input/output processes.
Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic
This paper describes the IDLab system submitted to Task A of the CLPsych 2018 shared task. The goal of this task is predicting psychological health of children based on language used in handwritten essays and sociodemographic control variables. Our entry uses word- and character-based features as well as lexicon-based features and features derived from the essays, such as the quality of the language. We apply linear models, gradient boosting, and neural-network-based regressors (feed-forward, CNNs and RNNs) to predict scores. We then build ensembles of our best-performing models using a weighted average.
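The final ensembling step mentioned in this abstract, a weighted average over model predictions, can be sketched as follows. The function name `weighted_average` and the example weights and scores are hypothetical. How the weights were chosen in the actual system is not stated in the abstract.

```python
def weighted_average(predictions, weights):
    """Combine per-model prediction lists into one list using normalized weights.

    predictions: one list of predicted scores per model, all of equal length.
    weights: one non-negative weight per model (need not sum to 1).
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    n_items = len(predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions)) / total
        for i in range(n_items)
    ]

# Two hypothetical regressors scoring the same two essays.
model_a = [1.0, 2.0]
model_b = [3.0, 4.0]
ensemble = weighted_average([model_a, model_b], [1, 3])
print(ensemble)
```

Normalizing by the sum of the weights keeps the ensemble output on the same scale as the individual model scores.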
Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)
Short text clustering is a challenging problem when adopting traditional bag-of-words or TF-IDF representations, since these lead to sparse vector representations for short texts. Low-dimensional continuous representations or embeddings can counter that sparseness problem: their high representational power is exploited in deep clustering algorithms. While deep clustering has been studied extensively in computer vision, relatively little work has focused on NLP. The method we propose learns discriminative features from both an autoencoder and a sentence embedding, then uses assignments from a clustering algorithm as supervision to update the weights of the encoder network. Experiments on three short text datasets empirically validate the effectiveness of our method.
Proceedings of the 2019 Conference of the North
This paper introduces improved methods for sub-event detection in social media streams, by applying neural sequence models not only on the level of individual posts, but also directly on the stream level. Current approaches to identify sub-events within a given event, such as a goal during a soccer match, essentially do not exploit the sequential nature of social media streams. We address this shortcoming by framing the sub-event detection problem in social media streams as a sequence labeling task and adopt a neural sequence architecture that explicitly accounts for the chronological order of posts. Specifically, we (i) establish a neural baseline that outperforms a graph-based state-of-the-art method for binary sub-event detection (2.7% micro-F1 improvement), and (ii) demonstrate the superiority of a recurrent neural network model on the post sequence level for labeled sub-events (2.4% bin-level F1 improvement over non-sequential models).
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Adversarial training (AT) is a regularization method that can be used to improve the robustness of neural network methods by adding small perturbations to the training data. We show how to use AT for the tasks of entity recognition and relation extraction. In particular, we demonstrate that applying AT to a general-purpose baseline model for jointly extracting entities and relations improves the state-of-the-art effectiveness on several datasets in different contexts (i.e., news, biomedical, and real estate data) and for different languages (English and Dutch).
Proceedings of ACL 2018, System Demonstrations
Many Machine Reading and Natural Language Understanding tasks require reading supporting text in order to answer questions. For example, in Question Answering, the supporting text can be newswire or Wikipedia articles; in Natural Language Inference, premises can be seen as the supporting text and hypotheses as questions. Providing a set of useful primitives operating in a single framework of related tasks would allow for expressive modelling, and easier model comparison and replication. To that end, we present Jack the Reader (JACK), a framework for Machine Reading that allows for quick model prototyping by component reuse, evaluation of new models on existing datasets, as well as integrating new datasets and applying them to a growing set of implemented baseline models. JACK currently supports (but is not limited to) three tasks: Question Answering, Natural Language Inference, and Link Prediction. It is developed with the aim of increasing research efficiency and code reuse.
Expert Systems with Applications
State-of-the-art models for joint entity recognition and relation extraction strongly rely on external natural language processing (NLP) tools such as POS (part-of-speech) taggers and dependency parsers. Thus, the performance of such joint models depends on the quality of the features obtained from these NLP tools. However, these features are not always accurate for various languages and contexts. In this paper, we propose a joint neural model which performs entity recognition and relation extraction simultaneously, without the need for any manually extracted features or any external tool. Specifically, we model the entity recognition task using a CRF (Conditional Random Fields) layer and the relation extraction task as a multi-head selection problem (i.e., potentially identifying multiple relations for each entity). We present an extensive experimental setup to demonstrate the effectiveness of our method using datasets from various contexts (i.e., news, biomedical, real estate) and languages (i.e., English, Dutch). Our model outperforms the previous neural models that use automatically extracted features, while it performs within a reasonable margin of feature-based neural models, or even beats them.
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers
In this paper, we address the (to the best of our knowledge) new problem of extracting a structured description of real estate properties from their natural language descriptions in classifieds. We survey and present several models to (a) identify important entities of a property (e.g., rooms) from classifieds and (b) structure them into a tree format, with the entities as nodes and edges representing a part-of relation. Experiments show that a graph-based system deriving the tree from an initially fully connected entity graph outperforms a transition-based system starting from only the entity nodes, since it better reconstructs the tree.
IEEE Microwave and Wireless Components Letters, Apr 1, 2008
Accurately modeling interconnect structures is an important issue in high-frequency chip design. Conductors have a finite thickness and conductivity, and are often composed of different metals. It is shown that the Dirichlet-to-Neumann technique can be used to model the inductive and resistive behavior of such structures, up to high frequencies at which the skin effect is well-developed. Furthermore, the method can be used for the accurate and fast calculation of the longitudinal current distribution in the composite conductors.
Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval - ICMR '16, 2016
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval - SIGIR '13, 2013
Search engines can improve their efficiency by selecting only a few promising shards for each query. State-of-the-art shard selection algorithms first query a central index of sampled documents, and their effectiveness is similar to searching all shards. However, the search in the central index also hurts efficiency. Additionally, we show that the effectiveness of these approaches varies substantially with the sampled documents. This paper proposes Taily, a novel shard selection algorithm that models a query's score distribution in each shard as a Gamma distribution and selects shards with highly scored documents in the tail of the distribution. Taily estimates the parameters of score distributions based on the mean and variance of the score function's features in the collections and shards. Because Taily operates on term statistics instead of document samples, it is efficient and has deterministic effectiveness. Experiments on large web collections (Gov2, CluewebA and CluewebB) show that Taily achieves similar effectiveness to sample-based approaches, and improves upon their efficiency by roughly 20% in terms of used resources and response time.
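The idea of ranking shards by the tail of a Gamma score distribution can be sketched as below. This is an illustration under stated assumptions, not the published algorithm: it assumes the Gamma parameters are obtained by moment matching from a per-shard mean and variance, and that shards are ranked by the expected number of documents above a fixed score cutoff. All names (`fit_gamma`, `gamma_cdf`, `rank_shards`) and the shard statistics are hypothetical.

```python
import math

def fit_gamma(mean, variance):
    """Moment matching (an assumption here): shape = mean^2 / var, scale = var / mean."""
    return mean * mean / variance, variance / mean

def gamma_cdf(x, shape, scale):
    """Regularized lower incomplete gamma function P(shape, x / scale), via its power series."""
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape
    total = term
    n = 1
    while term > 1e-12 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))

def rank_shards(shard_stats, cutoff):
    """Order shard names by the expected number of documents scoring above the cutoff."""
    expected = {}
    for name, (mean, variance, n_docs) in shard_stats.items():
        shape, scale = fit_gamma(mean, variance)
        expected[name] = n_docs * (1.0 - gamma_cdf(cutoff, shape, scale))
    return sorted(expected, key=expected.get, reverse=True), expected

# Hypothetical per-shard statistics for one query: (mean score, score variance, matching docs).
stats = {"A": (2.0, 1.0, 1000), "B": (4.0, 2.0, 500)}
ranked, expected = rank_shards(stats, cutoff=5.0)
print(ranked, expected)
```

Because every quantity is computed from summary statistics, no documents need to be sampled or retrieved to produce the ranking, which matches the abstract's point that the approach works on term statistics rather than document samples.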
2009 International Conference on Electromagnetics in Advanced Applications, 2009
This paper describes an accurate method to discretize the Dirichlet-to-Neumann boundary operator for a convex polygonal conductor. The technique is based on an expansion of the boundary value of the current density. Because the corresponding expansion functions exhibit the exact current behavior inside the conductor, they ensure a very good accuracy up to skin effect frequencies. In combination with a classical boundary integral method and the Method of Moments, the Dirichlet-to-Neumann technique allows for a direct determination of the resistive and inductive properties of transmission line configurations constructed from these conductors, as is illustrated with some numerical examples.
IEEE Transactions on Microwave Theory and Techniques, 2000
This paper presents a new multiconductor transmission line model for general 2-D lossy configurations based on mode reciprocity. Particular attention is devoted to elucidating the validity of the quasi-TM model and the approximations that have to be invoked to obtain this model. A new derivation of the complex capacitance matrix is given, especially taking into account the presence of semiconductors. This derivation automatically leads to a nonclassical circuit signal current definition and demands a formulation of the complex inductance problem consistent with that definition. The relevant resistance, inductance, conductance, and capacitance circuit matrices are obtained by solving boundary integral equations only, making use of the Dirichlet-to-Neumann boundary operator for the different materials. This allows the simulation of complex metal-insulator-semiconductor structures, as shown in the numerical examples.
IEEE Microwave and Wireless Components Letters, 2000