2008
394 pages
Abstract. Knowledge discovery is a compute- and data-intensive process for finding patterns, trends, and models in large datasets. The grid can be effectively exploited for deploying knowledge discovery applications because of the high performance it can offer and its distributed infrastructure. For effective use of grids in knowledge discovery, middleware is critical to support data management, data transfer, data mining, and knowledge representation.
Data Mining and Knowledge Discovery in Real Life Applications, 2009
In the last few decades, Grid technologies have emerged as an important area in parallel and distributed computing. The Grid can be seen as large-scale computational support and, in some cases, as high-performance support. In recent years, the data mining community has increasingly used Grid facilities to store, share, manage, and mine large-scale data-driven applications. Indeed, data mining and knowledge discovery applications are distributed by nature and use the Grid as their execution environment. This has led to great interest in distributed data mining and knowledge discovery on large Grid platforms. Many Grid-based Data Mining (DM) and Knowledge Discovery (KD) frameworks have been initiated, proposing different techniques and solutions for mining large-scale datasets. These include the ADMIRE project initiated by the PCRG (Parallel Computational Research Group) at the University College Dublin, the Knowledge Grid project at the University of Calabria, and the GridMiner project at the University of Vienna, among others. These knowledge discovery¹ frameworks on the Grid aim to offer high-level abstractions and techniques for distributed management, mining, and knowledge extraction from data repositories and warehouses. Most of them use existing Grid technologies and systems to build specific knowledge discovery services and data management, analysis, and mining techniques. This consists of either porting existing algorithms and applications to the Grid or developing new mining and knowledge extraction techniques that exploit Grid features and services. Grid infrastructures usually provide basic services such as communication, authentication, storage and computing resources, and data placement and management. For example, the Knowledge Grid system uses services provided by the Globus Toolkit, and the ADMIRE framework uses a Grid system called DGET, developed by our team at the University College Dublin. We will give some details about the best-known DM/KD frameworks in section 2. Note that this chapter is not intended to cover Grid systems or the way they are interfaced with knowledge discovery frameworks. Indeed, beyond the architecture design of Grid systems, the resources and data management policies, the data integration or placement techniques, and so on, these DM and KD frameworks need ...
¹ Knowledge Discovery is a more general term that includes the Data Mining process. It refers to the overall knowledge extraction process.
2017
During the last decade or so, we have had a deluge of data not only from science but also from industry and commerce. Although the amount of data available to us is constantly increasing, processing it is becoming more and more difficult. Efficient discovery of useful knowledge from these datasets is therefore becoming a challenge and a massive economic need. This has led to the need to develop large-scale data mining (DM) techniques to deal with these huge datasets from both scientific and economic applications. In this chapter, we present a new DDM system combining dataset-driven and architecture-driven strategies. Dataset-driven strategies consider the size and heterogeneity of the data, while architecture-driven strategies focus on the distribution of the datasets. This system is based on Grid middleware tools that integrate appropriate large data manipulation operations. Therefore, this allows more dynamicity and autonomicity during the mining, integrating and proce...
2003
The Grid is today mainly used for supporting high-performance, compute-intensive applications. However, it could be effectively exploited for deploying data-driven and knowledge discovery applications. To support this class of applications, tools and services for knowledge discovery are vital. The Knowledge Grid is a high-level system for providing Grid-based knowledge discovery services.
Lecture Notes in Computer Science, 2001
Knowledge discovery tools and techniques are used in an increasing number of scientific and commercial areas for the analysis of large data sets. When large data repositories are coupled with geographic distribution of data, users, and systems, it is necessary to combine different technologies for implementing high-performance distributed knowledge discovery systems. At the same time, the computational grid is emerging as a very promising infrastructure for high-performance distributed computing. In this paper we introduce a software architecture for parallel and distributed knowledge discovery (PDKD) systems that is built on top of computational grid services that provide dependable, consistent, and pervasive access to high-end computational resources. The proposed architecture uses the grid services and defines a set of additional layers to implement the services of the distributed knowledge discovery process on grid-connected sequential or parallel computers.
2003
The increasing use of computers in all areas of human activity is resulting in huge collections of digital data. Databases are common everywhere and are used as repositories of every kind of data. Knowledge discovery techniques and tools are used today to analyze those very large data sets and identify interesting patterns and trends in them. When data is maintained over geographically distributed sites, the computational power of distributed and parallel systems can be exploited for knowledge discovery in databases.
Proc. of the 4th International Conference on Algorithms and Architectures for Parallel Computing (ICA3PP), 2000
In an increasing number of scientific and commercial areas, tools and systems for the analysis of large data sets are emerging as important resources. In particular, when large data sets are coupled with geographic distribution of data, users, and systems, it is necessary to combine different technologies for implementing high-performance distributed knowledge discovery systems. The discipline that studies and uses those tools is named Parallel and Distributed Knowledge Discovery (PDKD). In this paper we introduce a reference software architecture for PDKD systems that is built on top of computational grids that provide dependable, consistent, and pervasive access to high-end computational resources. The proposed architecture uses the grid services and defines a set of additional layers to implement the services of the distributed knowledge discovery process on distributed computers where each node can be a sequential or a parallel machine.
Abstract. Grid infrastructures can be used for developing data- and knowledge-intensive applications, as they offer resources, services, and access mechanisms for data management and knowledge extraction from remote data sources. Data mining algorithms and knowledge discovery processes are both compute- and data-intensive; therefore the Grid can offer a computing and data management infrastructure for supporting distributed high-performance data analysis.
2004
Abstract. The Grid is mainly used today for supporting high-performance, compute-intensive applications. However, it is going to be effectively exploited for deploying data-driven and knowledge discovery applications. To support this class of applications, high-level tools and services are vital. The Knowledge Grid is a high-level system for providing Grid-based knowledge discovery services.
Concurrency and Computation: Practice and Experience, 2007
KDDML-G is a middleware language and system for knowledge discovery on the grid. The challenge that motivated the development of a grid-enabled version of the 'standalone' KDDML (Knowledge Discovery in Databases Markup Language) environment was, on the one hand, to exploit the parallelism offered by the grid environment and, on the other hand, to overcome the problem of data immovability, a frequent restriction on real-world data collections that is mainly motivated by privacy preservation. The latter issue is addressed by moving the code and mining the data in place, that is, by adapting the computation to the availability and location of the data. ... community have produced protocols, services, and tools that address the challenges concerning scalable virtual organizations.
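The idea of moving the code to the data, rather than the data to the code, can be illustrated with a minimal Python sketch. It is not KDDML-G's language or API: the `Site` class, `count_items`, `mine_in_place`, and the sample transactions are all hypothetical names and data chosen for illustration. Each site runs the shipped function locally and returns only aggregate counts, so raw records never leave the site.

```python
from collections import Counter
from typing import Callable, Dict, List

Transactions = List[List[str]]


class Site:
    """Hypothetical stand-in for a grid node whose data may not be moved."""

    def __init__(self, name: str, transactions: Transactions):
        self.name = name
        self._transactions = transactions  # kept local to the site

    def run(self, task: Callable[[Transactions], Counter]) -> Counter:
        # Execute the shipped code locally; only its aggregate result is returned.
        return task(self._transactions)


def count_items(transactions: Transactions) -> Counter:
    """Count, per item, the number of local transactions that contain it."""
    counts: Counter = Counter()
    for transaction in transactions:
        counts.update(set(transaction))
    return counts


def mine_in_place(sites: List[Site], min_support: int) -> Dict[str, int]:
    """Ship count_items to every site and merge the local counts centrally."""
    total: Counter = Counter()
    for site in sites:
        total += site.run(count_items)
    return {item: n for item, n in total.items() if n >= min_support}


# Illustrative data only (an assumption, not taken from the source).
sites = [
    Site("site_a", [["bread", "milk"], ["bread", "eggs", "jam"]]),
    Site("site_b", [["milk", "eggs"], ["bread", "milk", "eggs"]]),
]
frequent = mine_in_place(sites, min_support=3)
print(frequent)
```

In this sketch the coordinator sees only per-site item counts. A real deployment would also need scheduling, authentication, and transport, which are outside the scope of this illustration.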