Papers by Marcela Xavier Ribeiro
The hepatitis dataset was analyzed based on the mining of multirelational association rules. Experiments were conducted to analyze data on blood and urine exams and biopsy results to infer information on the behavior of degrees of fibrosis. Multirelational association rules were obtained using a new algorithm called Connection, which identifies patterns in different tables without joining them. The results of our analysis are discussed here, as is the Connection algorithm.
Mineração de Regras de Associação Usando Agrupamentos (Association Rule Mining Using Clustering). Marcela Xavier Ribeiro1,2, Marina Teresa Pires Vieira1,3, Agma Juci Machado Traina2. 1 Departamento de Ciências da Computação, Universidade Federal ...
A data warehouse (DW) is a large, subject-oriented, non-volatile, historical database and an important component of Business Intelligence. OLAP (Online Analytical Processing) queries are executed on the DW and often incur high response times. Data fragmentation, materialized views, and indices aim to improve performance in processing these queries. Additionally, NoSQL (Not only SQL) databases are used instead of relational databases to improve specific aspects such as query-processing performance. This paper investigates and compares DW implementations using relational and NoSQL databases. We evaluated query response times, memory usage, and CPU usage percentage, considering the queries of the Star Schema Benchmark. As a result, the column-oriented model implemented by the FastBit software showed time gains of 25.4% to 99.8% in query processing when compared with the other NoSQL models and the relational model. Resumo. Data warehouse (DW) ...
Proceedings of the 5th international conference on Soft computing as transdisciplinary science and technology - CSTST '08, 2008
Data pre-processing is a key element to improve the accuracy of data mining algorithms. In the pre-processing step, the data are treated in order to make the mining process achievable and effective. Data discretization and feature selection are two important tasks that can be performed prior to the learning phase and can significantly reduce the processing effort of the data
Abstract - The two main nightmares that reduce the quality of content-based search are: a) the "curse of high dimensionality", which degrades index structures and reduces the discriminating power of the features extracted from images, and b) the "semantic gap" between the representation of low-level features and their human interpretation. This paper proposes a new method to increase the precision of content-based searches over medical images that combines association rule mining and relevance feedback techniques. Statistical association rules are used to select the features with the highest discriminating power, addressing the curse of high dimensionality. An efficient relevance feedback technique is used to address the semantic gap. Experiments show that the proposed method is effective, increasing search precision by up to 100%. Keywords: ...
Proceedings of the IEEE Symposium on Computer-Based Medical Systems
Feature selection can significantly improve the precision of content-based queries in image databases by removing noisy features or by boosting the most relevant ones. Continuous feature selection techniques assign a continuous weight to each feature according to its relevance. In this paper, we propose a supervised method for continuous feature selection. The proposed method applies statistical association rules to find patterns relating low-level image features to high-level knowledge about the images, and it uses the mined patterns to determine the weight of each feature. Feature weighting through statistical association rules reduces the semantic gap between low-level features and the high-level user interpretation of images, improving the precision of content-based queries. Moreover, the proposed method performs dimensionality reduction of image features, avoiding the "dimensionality curse" problem. Experiments show that the proposed method impr...
Abstract. In this paper, we propose new techniques to improve the quality of similarity queries over image databases by performing association rule mining over textual descriptions and automatically extracted features of the image content. Based on the knowledge mined, each query ...