Papers by Leonardo Rigutini
The Web consists of a large amount of unstructured information that can hardly be processed by automatic agents. In recent years, a considerable number of techniques for information extraction from Web resources have been proposed. In particular, while many different approaches have been devised to automatically identify the structure of the data in a document (e.g. Named Entity Recognition, Part-Of-Speech tagging), few systems exist to assign semantics to such data. This significantly limits the use of these systems in data integration, since human intervention is required to label the extracted entities. Once these data have been semantically labeled, in fact, they can be used to fill databases, to build lexicons, and to provide additional attributes for document categorization or clustering tasks. This paper proposes an evolution of the system for automatically categorizing terms or lexical entities presented in [5]. We added a submodule for multi-label classification and tested the system on a standard benchmark, comparing its performance with the work presented in [1, 2].
Due to the globalization of the Web, many companies and institutions need to efficiently organize and search repositories containing multilingual documents. The management of these heterogeneous text collections increases costs significantly, because experts in different languages are required to organize them. Cross-Language Text Categorization provides techniques to extend existing automatic classification systems from one language to new languages without requiring additional intervention from human experts. In this paper we propose a learning algorithm based on the EM scheme which can be used to train text classifiers in a multilingual environment. In particular, in the proposed approach, we assume that a predefined category set and a collection of labeled training data are available for a given language L1. A classifier for a different language L2 is trained by translating the available labeled training set from L1 to L2 and by using an additional set of unlabeled documents from L2. This technique allows us to extract correct statistical properties of the language L2 which are not completely available in the automatically translated examples, because of the different characteristics of language L1 and of the approximation introduced by the translation process. Our experimental results show that the performance of the proposed method is very promising when applied to a test document set extracted from newsgroups in English and Italian.
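The EM scheme described in this abstract can be illustrated with a toy self-training loop: a Naive Bayes classifier is first trained on the (machine-translated) labeled documents, then refined by alternating pseudo-labeling of the unlabeled target-language documents (E-step) and re-training (M-step). This is a minimal sketch with made-up token data, not the paper's exact implementation.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (tokens, label). Returns class priors and word counts."""
    priors, words = Counter(), defaultdict(Counter)
    for tokens, label in docs:
        priors[label] += 1
        words[label].update(tokens)
    return priors, words

def predict_nb(tokens, priors, words, vocab):
    total = sum(priors.values())
    best, best_lp = None, -math.inf
    for label in priors:
        n = sum(words[label].values())
        lp = math.log(priors[label] / total)
        for t in tokens:
            # Laplace smoothing over the joint vocabulary
            lp += math.log((words[label][t] + 1) / (n + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

def em_cross_language(translated_l1, unlabeled_l2, iters=5):
    vocab = {t for d, _ in translated_l1 for t in d} | {t for d in unlabeled_l2 for t in d}
    priors, words = train_nb(translated_l1)
    for _ in range(iters):
        # E-step: pseudo-label the target-language documents
        pseudo = [(d, predict_nb(d, priors, words, vocab)) for d in unlabeled_l2]
        # M-step: re-train on translated plus pseudo-labeled data
        priors, words = train_nb(translated_l1 + pseudo)
    return priors, words, vocab
```

The pseudo-labeled L2 documents contribute target-language statistics (e.g. words never produced by the translator) that the translated set alone cannot supply.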
IEEE transactions on neural networks and learning systems, Nov 1, 2018
Structured data in the form of labeled graphs (with variable order and topology) may be thought of as the outcomes of a random graph (RG) generating process characterized by an underlying probabilistic law. This paper formalizes the notions of generalized RG (GRG) and probability density function (pdf) for GRGs. Thence, a "universal" learning machine (combining the encoding module of a recursive neural network and a radial basis function network) is introduced for estimating the unknown pdf from an unsupervised sample of GRGs. A maximum likelihood training algorithm is presented and constrained so as to ensure that the resulting model satisfies the axioms of probability. Techniques for preventing the model from collapsing into degenerate solutions are proposed, as well as variants of the algorithm suited to graph classification and graph clustering tasks. The major properties of the machine are discussed. The approach is validated empirically through experimental investigations in the estimation of pdfs for synthetic and real-life GRGs, in the classification of images from the Caltech Benchmark data set and molecules from the Mutagenesis data set, and in the clustering of images from the LabelMe data set.
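A heavily simplified analogue of the density-estimation idea above: a radial-basis-function (Gaussian mixture) model trained by maximum likelihood via EM, with mixing weights kept on the probability simplex so that the resulting model satisfies the axioms of probability. This toy version works on scalar inputs; the actual machine additionally encodes variable-size graphs through a recursive neural network, which is out of scope here.

```python
import math
import random

def gauss(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_rbf_density(xs, k=2, iters=50):
    random.seed(0)
    mus = random.sample(xs, k)
    varz = [1.0] * k
    w = [1.0 / k] * k                      # simplex-constrained mixing weights
    for _ in range(iters):
        # E-step: responsibilities of each basis function for each point
        resp = []
        for x in xs:
            p = [w[j] * gauss(x, mus[j], varz[j]) for j in range(k)]
            s = sum(p)
            resp.append([pj / s for pj in p])
        # M-step: closed-form maximum-likelihood updates
        for j in range(k):
            nj = sum(r[j] for r in resp)
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            varz[j] = max(1e-3, sum(r[j] * (x - mus[j]) ** 2
                                    for r, x in zip(resp, xs)) / nj)
            w[j] = nj / len(xs)            # stays nonnegative, sums to one
    return lambda x: sum(w[j] * gauss(x, mus[j], varz[j]) for j in range(k))
```

Because the weights remain on the simplex and each basis is a proper Gaussian, the returned function is nonnegative and integrates to one.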
European Conference on Artificial Intelligence, May 22, 2006
Term weighting is a crucial task in many Information Retrieval applications. Common approaches are based either on statistical or on natural language analysis. In this paper, we present a new algorithm that capitalizes on the advantages of both strategies. In the proposed method, the weights are computed by a parametric function, called the Context Function, that models the semantic influence exercised amongst the terms. The Context Function is learned from examples, so that its implementation is mostly ...
Language Resources and Evaluation, May 1, 2018
Supervised models for Word Sense Disambiguation (WSD) currently yield state-of-the-art results in the most popular benchmarks. Despite the recent introduction of Word Embeddings and Recurrent Neural Networks to design powerful context-related features, the interest in improving WSD models using Semantic Lexical Resources (SLRs) is mostly restricted to knowledge-based approaches. In this paper, we enhance "modern" supervised WSD models exploiting two popular SLRs: WordNet and WordNet Domains. We propose an effective way to introduce semantic features into the classifiers, and we consider using the SLR structure to augment the training data. We study the effect of different types of semantic features, investigating their interaction with local contexts encoded by means of mixtures of Word Embeddings or Recurrent Neural Networks, and we extend the proposed model into a novel multi-layer architecture for WSD. A detailed experimental comparison in the recent Unified Evaluation Framework (Raganato et al., 2017) shows that the proposed approach leads to supervised models that compare favourably with the state-of-the-art.
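A minimal, hypothetical sketch of the feature construction described in this abstract: the local context is encoded as a mixture (average) of word embeddings and concatenated with semantic features derived from a lexical resource. The tiny embedding table and domain map used below are made-up stand-ins for real Word Embeddings and WordNet Domains, not the paper's actual resources.

```python
def context_vector(tokens, emb):
    """Average the embeddings of the context words found in the table."""
    vecs = [emb[t] for t in tokens if t in emb]
    dim = len(next(iter(emb.values())))
    if not vecs:
        return [0.0] * dim
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]

def semantic_features(word, domain_map, domains):
    # one-hot flags: 1.0 if some sense of the word belongs to the domain
    active = domain_map.get(word, set())
    return [1.0 if d in active else 0.0 for d in domains]

def wsd_features(target, context, emb, domain_map, domains):
    # final feature vector: context mixture + semantic flags for the target
    return context_vector(context, emb) + semantic_features(target, domain_map, domains)
```

A classifier (e.g. a per-lemma softmax over senses) would then be trained on these concatenated vectors.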
This paper presents a software system that is able to generate crosswords with no human intervention, including definition generation and crossword compilation. In particular, the proposed system crawls relevant sources on the Web, extracts definitions from the downloaded pages using state-of-the-art Natural Language Processing (NLP) techniques and, finally, attempts to compile a crossword schema with the extracted definitions.
The facial expression is the first thing we pay attention to when we want to understand a person's state of mind. Thus, the ability to recognize facial expressions automatically is a very interesting research field. In this paper, because of the small size of the available training datasets, we propose a novel data augmentation technique that improves performance in the recognition task. We apply geometric transformations and build GAN models from scratch that are able to generate new synthetic images for each emotion type. We then fine-tune pretrained convolutional neural networks with different architectures on the augmented datasets. To measure the generalization ability of the models, we apply an extra-database protocol: we train the models on the augmented versions of the training dataset and test them on two different databases. The combination of these techniques allows us to reach average accuracy values of the order of 85% for the InceptionResNetV2 model.
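The geometric part of the augmentation pipeline can be sketched as a set of simple image transformations; each (grayscale) image, represented here as a list of pixel rows, is expanded into several transformed copies. The GAN-based synthesis mentioned in the abstract is beyond the scope of this illustration, and the specific transformations chosen below are assumptions, not the paper's exact set.

```python
def hflip(img):
    """Mirror the image horizontally (left-right)."""
    return [row[::-1] for row in img]

def vflip(img):
    """Mirror the image vertically (top-bottom)."""
    return img[::-1]

def shift_right(img, px, fill=0):
    """Translate the image px pixels to the right, padding with `fill`."""
    return [[fill] * px + row[:-px] for row in img] if px else [r[:] for r in img]

def augment(img):
    # original image plus three geometric variants
    return [img, hflip(img), vflip(img), shift_right(img, 1)]
```

Applied to every training image, this quadruples the effective dataset size before the GAN-generated samples are added.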
International Journal of Artificial Intelligence & Applications, Jan 31, 2022
Document clustering is a very hard task in Automatic Text Processing since it requires extracting regular patterns from a document collection without a priori knowledge of the category structure. This task can be difficult for humans too, because many different but equally valid partitions may exist for the same collection. Moreover, the lack of information about categories makes it difficult to apply effective feature selection techniques to reduce the noise in the representation of texts. Despite these intrinsic difficulties, text clustering is an important task for Web search applications, in which huge collections or quite long query result lists must be automatically organized. Semi-supervised clustering lies in between automatic categorization and auto-organization: the supervisor is not required to specify a set of classes, but only to provide a set of texts grouped by the criteria to be used to organize the collection. In this paper we present a novel algorithm for clustering text documents which exploits the EM algorithm together with a feature selection technique based on Information Gain. The experimental results show that only very few documents are needed to initialize the clusters and that the algorithm is able to properly extract the regularities hidden in a huge unlabeled collection.
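The Information-Gain-based feature selection step can be sketched as follows: given the few seed documents grouped by the supervisor, each term is scored by how much knowing its presence reduces the entropy of the cluster label, and only the top-scoring terms are kept. This is a generic IG computation, not necessarily the authors' exact formulation.

```python
import math
from collections import Counter

def entropy(counts):
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values() if c)

def information_gain(term, docs):
    """docs: list of (set_of_terms, cluster_label) seed documents."""
    prior = Counter(label for _, label in docs)
    with_t = Counter(l for t, l in docs if term in t)       # docs containing term
    without = Counter(l for t, l in docs if term not in t)  # docs lacking term
    n = len(docs)
    cond = (sum(with_t.values()) / n) * entropy(with_t) \
         + (sum(without.values()) / n) * entropy(without)
    return entropy(prior) - cond

def select_features(docs, k):
    vocab = set().union(*(t for t, _ in docs))
    return sorted(vocab, key=lambda t: information_gain(t, docs), reverse=True)[:k]
```

The selected terms then define the reduced representation on which the EM clustering iterates.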
IEEE Transactions on Neural Networks, Sep 1, 2011
Relevance ranking consists of sorting a set of objects with respect to a given criterion. However, in personalized retrieval systems, the relevance criteria may vary among different users and may not be predefined. In this case, ranking algorithms that adapt their behavior from users' feedback must be devised. Two main approaches to learning to rank are proposed in the literature: the use of a scoring function, learned from examples, that evaluates a feature-based representation of each object, yielding an absolute relevance score; and a pairwise approach, where a preference function is learned to determine which object in a given pair has to be ranked first. In this paper, we present a preference learning method for learning to rank. A neural network, the comparative neural network (CmpNN), is trained from examples to approximate the comparison function for a pair of objects. The CmpNN adopts a particular architecture designed to implement the symmetries naturally present in a preference function. The learned preference function can be embedded as the comparator into a classical sorting algorithm to provide a global ranking of a set of objects. To improve the ranking performance, an active-learning procedure is devised that aims at selecting the most informative patterns in the training set. The proposed algorithm is evaluated on the LETOR dataset, showing promising performance in comparison with other state-of-the-art algorithms.
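The pairwise idea can be sketched with a comparator whose antisymmetry (pref(x, y) = -pref(y, x), one of the symmetries a preference function must satisfy) is enforced structurally by evaluating a shared sub-network on both orderings of the pair, and which is then plugged into a standard sorting routine. The weights below are fixed by hand for illustration; in the CmpNN they would be learned from examples, and the exact weight-sharing scheme of the paper may differ.

```python
import functools

def subnet(a, b, w):
    # shared one-hidden-layer network applied to the concatenated pair (a, b)
    h = [max(0.0, sum(wi * v for wi, v in zip(row, a + b))) for row in w["hidden"]]
    return sum(wo * hi for wo, hi in zip(w["out"], h))

def preference(a, b, w):
    # antisymmetric by construction: swapping the pair flips the sign
    return subnet(a, b, w) - subnet(b, a, w)

def rank(items, w):
    # embed the learned comparator into a classical sorting algorithm
    cmp = lambda a, b: -1 if preference(a, b, w) > 0 else 1
    return sorted(items, key=functools.cmp_to_key(cmp))
```

Because the same `subnet` weights score both orderings, no extra constraint or penalty is needed to keep the comparator consistent.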
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track
Real-world business applications require a trade-off between language model performance and size. We propose a new method for model compression that relies on vocabulary transfer. We evaluate the method on various vertical domains and downstream tasks. Our results indicate that vocabulary transfer can be effectively used in combination with other compression techniques, yielding a significant reduction in model size and inference time while only marginally compromising on performance.
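One common way to realize vocabulary transfer, sketched below under assumptions (the paper's exact procedure may differ): each token of the new, smaller in-domain vocabulary is initialized by decomposing it with the old vocabulary and averaging the corresponding old embeddings; tokens shared by both vocabularies are copied directly. The greedy longest-match tokenizer is an illustrative stand-in for the original model's tokenizer.

```python
def greedy_tokenize(word, vocab):
    """Split `word` into pieces from `vocab`, longest match first."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            return []           # the word cannot be covered by the old vocab
    return pieces

def transfer_embeddings(new_vocab, old_emb):
    new_emb = {}
    for tok in new_vocab:
        if tok in old_emb:
            new_emb[tok] = old_emb[tok]     # identical tokens copied directly
            continue
        pieces = greedy_tokenize(tok, old_emb)
        if pieces:
            dim = len(next(iter(old_emb.values())))
            # average the embeddings of the old sub-tokens
            new_emb[tok] = [sum(old_emb[p][d] for p in pieces) / len(pieces)
                            for d in range(dim)]
    return new_emb
```

Shrinking the vocabulary shrinks the embedding matrix, which in compact transformer models accounts for a large share of the parameters.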
International Joint Conference on Artificial Intelligence, Jan 6, 2007
Term weighting systems are of crucial importance in Information Extraction and Information Retrieval applications. Common approaches to term weighting are based either on statistical or on natural language analysis. In this paper, we present a new algorithm that capitalizes on the advantages of both strategies by adopting a machine learning approach. In the proposed method, the weights are computed by a parametric function, called the Context Function, that models the semantic influence exercised amongst the terms of the same context. The Context Function is learned from examples, allowing the use of statistical and linguistic information at the same time. The novel algorithm was successfully tested on crossword clues, which represent a case of Single-Word Question Answering.
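A hypothetical sketch of what a Context Function might compute: the weight of each term is its base frequency, modulated by pairwise influence between terms co-occurring in the same context. The influence table below is hand-made for illustration; in the paper this function is parametric and learned from examples.

```python
from collections import Counter

def context_weights(tokens, influence):
    """tokens: terms of one context; influence: {(term, other): boost}."""
    tf = Counter(tokens)
    weights = {}
    for t in tf:
        # aggregate the influence exercised on t by the other context terms
        boost = sum(influence.get((t, u), 0.0) for u in tf if u != t)
        weights[t] = tf[t] * (1.0 + boost)
    return weights
```

A purely statistical weight (the term frequency) is thus combined with semantic information (the learned influences), which is the hybrid flavor the abstract describes.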
NLP Techniques and Applications, 2021
International Journal of Artificial Intelligence & Applications, 2022
IEEE Transactions on Neural Networks and Learning Systems, 2018
Machine Learning, 2011
We propose a general framework to incorporate first-order logic (FOL) clauses, which are thought of as an abstract and partial representation of the environment, into kernel machines that learn within a semi-supervised scheme. We rely on a multi-task learning scheme where each task is associated with a unary predicate defined on the feature space, while higher-level abstract representations consist of FOL clauses made of those predicates. We re-use the kernel machine mathematical apparatus to solve the problem as the primal optimization of a function composed of the loss on the supervised examples, the regularization term, and a penalty term deriving from forcing the real-valued constraints associated with the predicates. Unlike in classic kernel machines, however, depending on the logic clauses, the overall function to be optimized is no longer convex. An important contribution is to show that, while tackling the optimization by classic numerical schemes is likely to be hopeless, a stage-based learning scheme, in which we first learn the supervised examples until convergence is reached and then continue by forcing the logic clauses, is a viable direction to attack the problem. Some promising experimental results are given on artificial learning tasks and on the automatic tagging of bibtex entries, to emphasize the comparison with plain kernel machines.
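A toy sketch of the stage-based scheme: two unary predicates are first fitted on their supervised examples, and training then continues with an extra penalty enforcing a clause a(x) => b(x) through its product t-norm relaxation p_a(x) * (1 - p_b(x)). Everything here is an illustrative assumption: the paper uses kernel machines, not the 1-D logistic predictors and finite-difference descent below.

```python
import math

def sig(z):
    return 1.0 / (1.0 + math.exp(-z))

def sup_loss(w, data):
    """Squared loss of a 1-D logistic predictor w = [slope, bias]."""
    return sum((sig(w[0] * x + w[1]) - y) ** 2 for x, y in data) / len(data)

def clause_penalty(wa, wb, xs):
    # product t-norm relaxation of: forall x, a(x) => b(x)
    return sum(sig(wa[0] * x + wa[1]) * (1 - sig(wb[0] * x + wb[1]))
               for x in xs) / len(xs)

def descend(loss, w, steps=200, lr=0.5, eps=1e-4):
    """Plain gradient descent with finite-difference gradients."""
    for _ in range(steps):
        grad = []
        for i in range(len(w)):
            wp = list(w)
            wp[i] += eps
            grad.append((loss(wp) - loss(w)) / eps)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w

def stage_based_training(data_a, data_b, xs, lam=1.0):
    # stage 1: supervised fitting of each predicate until (near) convergence
    wa = descend(lambda w: sup_loss(w, data_a), [0.0, 0.0])
    wb1 = descend(lambda w: sup_loss(w, data_b), [0.0, 0.0])
    # stage 2: continue training b while also forcing the clause a(x) => b(x)
    wb2 = descend(lambda w: sup_loss(w, data_b) + lam * clause_penalty(wa, w, xs), wb1)
    return wa, wb1, wb2
```

Starting stage 2 from the supervised solution mirrors the paper's point: the logic penalty reshapes an already-sensible model instead of dominating a random initialization.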
International Journal on Artificial Intelligence Tools, 2012
Crossword puzzles are used every day by millions of people for entertainment, but they also have applications in educational and rehabilitation contexts. Unfortunately, the generation of ad-hoc puzzles, especially on specific subjects, typically requires a great deal of human expert work. This paper presents the architecture of WebCrow-generation, a system that is able to generate crosswords with no human intervention, including clue generation and crossword compilation. In particular, the proposed system crawls information sources on the Web, extracts definitions from the downloaded pages using state-of-the-art natural language processing techniques and, finally, compiles the crossword schema with the extracted definitions by constraint satisfaction programming. The system has been tested on the creation of Italian crosswords, but the extensive use of machine learning makes the system easily portable to other languages.
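The compilation step can be illustrated as a constraint satisfaction search in miniature: the grid is a set of slots (lists of cell coordinates), and backtracking assigns a distinct word to each slot so that crossing cells agree. This is a deliberately simplified sketch; the real system solves the problem with full constraint satisfaction programming over the definitions extracted from the Web.

```python
def compile_crossword(slots, words, assignment=None, cells=None):
    """slots: list of lists of (row, col) cells; returns one word per slot, or None."""
    assignment = assignment or []
    cells = cells or {}
    if len(assignment) == len(slots):
        return assignment
    slot = slots[len(assignment)]
    for w in words:
        if len(w) != len(slot) or w in assignment:   # fit length, words distinct
            continue
        # constraint: every already-filled crossing cell must carry the same letter
        if all(cells.get(c, ch) == ch for c, ch in zip(slot, w)):
            new_cells = dict(cells)
            new_cells.update(zip(slot, w))
            res = compile_crossword(slots, words, assignment + [w], new_cells)
            if res:
                return res
    return None                                       # backtrack
```

Each extracted definition would be attached to the word finally placed in its slot to produce the playable puzzle.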