Web Data Mining
Recent papers in Web Data Mining
In this paper, we present an overview of research issues in web mining. We discuss mining with respect to web data, referred to here as web data mining. In particular, our focus is on web data mining research in the context of our web warehousing...
An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new...
Increasing progress in numerous research fields and information technologies has led to an increase in the publication of research papers. As a result, researchers spend a lot of time finding interesting research papers that are close to their...
The number of user-contributed comments is increasing exponentially. Such comments are found widely in social media sites including internet discussion forums and news agency websites. In this paper, we summarize the current approaches to...
Web mining refers to the whole of data mining and related techniques that are used to automatically discover and extract information from web documents and services. When used in a business context and applied to some type of personal...
In recent years, the emergence of the WWW (World Wide Web) has led to the accumulation of a huge amount of information and data. Hence the web is found to consist of unstructured and structured information that impacts the day-to-day life of the...
With the huge amount of information available online, the World Wide Web is a fertile area for data mining research. Web mining research is at the crossroads of research from several research communities, such as database, information...
Modern e-commerce applications require data schemas that are constantly evolving and sparsely populated. The Internet, e-commerce and e-business definitely hold an important key to every organization’s future. Resting on the innovation...
This paper introduces the concept of product identity-clustering based on new similarity metrics and new performance metrics for web-crawled products. Product identity-clustering is defined here as the clustering of identical products,...
Design and implementation of a research support system for web data mining has become a challenge for researchers wishing to utilize useful information on the web. This paper proposes a framework for web data mining support systems. These...
Web data for doing sociology… of the Web?
Aggression in online social networks has been studied up to now, mostly with several machine learning methods which detect such behavior in a static context. However, the way aggression diffuses in the network has received little...
The explosive growth of the Web has drastically increased information circulation and dissemination rates. As the numbers of both Web users and Web sources grow significantly every day, crucial data management issues, such as clustering...
We propose two new XML applications, XGMML and LOGML. XGMML is a graph description language and LOGML is a web-log report description language. We generate a web graph in XGMML format for a web site using the web robot of the WWWPal...
Information retrieval is the process of searching within a document collection for information most relevant to a user's query. However, the type of document collection significantly affects the methods and algorithms used to process...
International Journal of Web & Semantic Technology (IJWesT) is a quarterly open access peer-reviewed journal that provides an excellent international forum for sharing knowledge and results in theory, methodology and applications of web &...
The Web is a vast, varied, and dynamic environment in which different users publish their documents. Web mining is one of the data mining applications in which web patterns are explored. Studies on web mining can be categorized into three...
The Internet has become one of the most used tools for job search. Many websites list millions of job vacancies; these large datasets require practical technologies and analytical methods for the extraction and data cleansing process....
Computer Methods and Programs in Biomedicine 100 (2010) 16-23. Abstract: This paper explores Technosocial Predictive Analytics (TPA) and related methods for Web "data mining" where users' posts...
Objectives: The primary objective of this research paper is to design a new and efficient clustering technique to group user navigation patterns, which are useful for a classification system to classify a new user with the previous users...
A distributed systems cluster is a group of machines that are virtually or geographically separated and that cooperate to provide the same service or application to clients of a web application. It is conceivable...
In this paper, we discuss association rules which can be discovered from web data. The association rules are discussed within the scope of our WHOWEDA (warehouse of web data) project. WHOWEDA is supported by a web data model and a set...
These days, billions of Web pages are created with HTML or other markup languages. They only have a few uniform structures and contain various authoring styles compared to traditional text-based documents. However, users usually focus on...
The present paper proposes a theoretical approach to a language use that is considered exceptional to the rule. Specifically, under the title ‘category invasion’, this paper investigates a phenomenon in which neighboring categories with...
The Semantic Web is considered a data integration system for different content and applications, in which every item has a specified meaning that machines can understand and process without the intervention of a human. Triple stores are...
To gain a competitive advantage in today's age of technology and growing data, and to bear competitive pressure, making strong decisions according to customers' needs and market trends has become very important....
One of the most time-consuming steps in association rule mining is computing the frequency of occurrence of itemsets in the database. The hash table index approach converts a transaction database to a hash index tree...
The explosive growth of the web has drastically changed the way in which information is managed and accessed. The large-scale of web data sources and the wide availability of services over the internet have increased the need for...
The Web is a collection of interrelated files on one or more web servers, while web mining means extracting valuable information from web databases. Web mining is one of the data mining domains where data mining techniques are used for...
The cost of acquiring training data instances for induction of data mining models is one of the main concerns in real-world problems. The web is a comprehensive source for many types of data which can be used for data mining tasks. But...
Academic institutions usually collect feedback from students on the main aspects of a course, such as preparation, content, delivery methods, punctuality, skills, appreciation, and learning experience. The feedback is collected in terms of...
Web mining is a computation-intensive task, even after the mining tool itself has been developed. Most mining software is developed ad hoc and is usually neither scalable nor reused for other mining tasks. The objective of this paper is to...
Deep web contents are accessed by queries submitted to web databases, and the returned data records are wrapped in dynamically generated web pages (called deep Web pages here). Extracting structured data from deep web pages is a...
YouTube, established in 2005, has become the most popular video-sharing website. The official YouTube website provides videos in different categories, including Science and Technology, Films and Animation, News and Politics,...