MapReduce
2,330 Followers
Recent papers in Mapreduce
HDT is a binary RDF serialization format aiming at minimizing the space overheads of traditional RDF formats, while providing retrieval features in compressed space. Several HDT-based applications, such as the recent Linked Data Fragments... more
In this paper, we analytically derive, implement, and empirically evaluate a solution for maximizing the execution rate of Map-Reduce jobs subject to power constraints in data centers. Our solution is novel in that it takes into account... more
In the last two decades, the continuous increase in computational power and recent advances in web technology have produced large amounts of data, which calls for large-scale data processing mechanisms to handle this volume. MapReduce is a... more
Pretty much every part of life now results in the generation of data. Logs are documentation of events or records of system activities and are created automatically through IT systems. Log data analysis is a process of making sense of... more
Twitter produces a massive amount of data due to its popularity that is one of the reasons underlying big data problems. One of those problems is the classification of tweets due to use of sophisticated and complex language, which makes... more
Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed... more
The emergence of big data analytics as a way of deriving insights from data brought excitement to mathematicians, statisticians, computer scientists and other professionals. However, the absence of a mathematical foundation for analytics... more
Data outsourcing allows data owners to keep their data at untrusted clouds that do not ensure the privacy of data and/or computations. One useful framework for fault-tolerant data processing in a distributed fashion is MapReduce, which... more
Even though many systems already implement customer analytics, it is still an emerging and largely unexplored market with great potential for further advancement. Big data is one of the fastest-rising technologies... more
The computer industry is being challenged to develop methods and techniques for affordable data processing on large datasets at optimum response times. The technical challenges in dealing with the increasing demand to handle vast... more
Big Data is large-volume of data generated by public web, social media and different networks, business applications, scientific instruments, types of mobile devices and different sensor technology. Data mining involves knowledge... more
A flexible, efficient and secure networking architecture is required in order to process big data. However, existing network architectures are mostly unable to handle big data. As big data pushes network resources to the limits it results... more
Data Mining refers to the process of mining useful data over large datasets. The discovery of interesting association relationships among large amounts of business transactions is currently vital for making appropriate business decisions.... more
In today’s age of information technology, processing data is a very important issue. Even terabyte- and petabyte-scale storage is no longer sufficient for large databases. The data is too big, moves too fast, or doesn’t... more
Map Reduce has gained remarkable significance as a prominent parallel data processing tool in the research community, academia and industry with the spurt in volume of data that is to be analyzed. Map Reduce is used in different... more
Twitter, a micro-blogging service, has been generating a large amount of data every minute as it gives people chance to express their thoughts and feelings quickly and clearly about any topics. To obtain the desired information from these... more
In distributed systems, database migration is not an easy task. Companies encounter challenges moving data, including legacy data, to the big data platform. This paper shows how to optimize migration from traditional databases to the... more
Accurate recognition and differentiation of the human facial expressions require substantial computational power, where the efficiency of algorithm plays a vital role. Recent advancement in the human computer interaction and object... more
The MapReduce framework has attracted profound attention from many different areas. It is presently a practical model for data-intensive applications due to its simple programming interface, high scalability, and ability to withstand... more
This document presents the handbook of BigDataBench (Version 3.1). BigDataBench is an open-source big data benchmark suite, publicly available from http://prof.ict.ac.cn/BigDataBench. After identifying diverse data models and... more
An effective technique for processing and analysing large amounts of data is the MapReduce framework. It is a programming model used to rapidly process vast amounts of data in a parallel and distributed mode... more
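The abstract above describes MapReduce's parallel processing model in general terms. As a generic illustration of the three phases (not code from the paper), a minimal pure-Python word-count sketch, where a local sort-and-group stands in for the framework's shuffle:

```python
from itertools import groupby

def mapper(lines):
    # Map phase: emit a (word, 1) pair for every token in the input.
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def shuffle(pairs):
    # Local stand-in for the framework's shuffle: sort pairs and group by key.
    return groupby(sorted(pairs), key=lambda kv: kv[0])

def reducer(grouped):
    # Reduce phase: sum the counts emitted for each word.
    for word, group in grouped:
        yield word, sum(count for _, count in group)

counts = dict(reducer(shuffle(mapper(["to be or not to be"]))))
print(counts)
```

In a real cluster the mapper and reducer run on many machines over input splits; the structure of the two user-supplied functions is the same.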
The efficiency and scalability of the cluster depend heavily on the performance of the single NameNode.
In order to make accurate and fast keywords and full text searches it is recommended to index the words in the corpus. One way to do this is to use an inverted index to maintain in a structured form the words occurrence in a set of... more
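As a generic illustration of the inverted-index idea this abstract describes (the corpus format and function names here are hypothetical, not from the paper), a minimal sketch mapping each term to a posting list of document ids, with keyword search as a posting-list intersection:

```python
from collections import defaultdict

def build_inverted_index(docs):
    # docs: {doc_id: text}. Map each term to the sorted list of doc ids
    # in which it occurs (its posting list).
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def search(index, *terms):
    # A conjunctive keyword query is an intersection of posting lists.
    postings = [set(index.get(t.lower(), ())) for t in terms]
    return sorted(set.intersection(*postings)) if postings else []

docs = {1: "big data on Hadoop", 2: "Hadoop runs MapReduce", 3: "big graphs"}
index = build_inverted_index(docs)
```

Building such an index is itself a natural MapReduce job: map emits (term, doc_id) pairs, reduce collects each term's posting list.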
Graphs are everywhere in our lives: social networks, the World Wide Web, biological networks, and many more. The size of real-world graphs are growing at unprecedented rate, spanning millions and billions of nodes and edges. What are the... more
Apriori is designed to operate on databases containing transactions (for example, collections of items bought by customers, or details of a website frequentation). Other algorithms are designed for finding association rules in data having... more
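The level-wise Apriori search over transactions can be sketched generically (an illustration of the classic algorithm, not this paper's implementation): frequent k-itemsets are joined into (k+1)-candidates, which are pruned by downward closure (every subset of a frequent itemset must itself be frequent) before their support is counted.

```python
from itertools import combinations

def apriori(transactions, min_support):
    # Find all itemsets contained in at least min_support transactions.
    transactions = [frozenset(t) for t in transactions]
    items = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    current = {s for s in items
               if sum(s <= t for t in transactions) >= min_support}
    k = 1
    while current:
        for s in current:
            frequent[s] = sum(s <= t for t in transactions)
        # Join step: merge frequent k-itemsets into (k+1)-candidates.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k + 1}
        # Prune step: keep candidates whose every k-subset is frequent
        # and which meet the support threshold.
        current = {c for c in candidates
                   if all(frozenset(sub) in frequent
                          for sub in combinations(c, k))
                   and sum(c <= t for t in transactions) >= min_support}
        k += 1
    return frequent

transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]
frequent = apriori(transactions, min_support=2)
```

Association rules are then derived from the frequent itemsets by comparing the supports of an itemset and its subsets.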
Data has become an indispensable part of every economy, industry, organization, business function and individual. Big Data is a term used to identify datasets whose size is beyond the ability of typical database software tools to... more
In recent years, big data are generated from a variety of sources, and there is an enormous demand for storing, managing, processing, and querying on big data. The MapReduce framework and its open source implementation Hadoop, has proven... more
Bigdata is a horizontally scaled, open-source storage architecture for indexed data and a computing fabric supporting optional transactions and very high concurrency; it operates in both single-machine and cluster modes. The bigdata... more
The main aim of this paper is to reduce the burden of the single reducer per map. Nowadays, MapReduce performance improvement is very significant for big data processing. In previous works, we tried to reduce the network traffic cost for... more
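One standard way to cut both reducer load and shuffle traffic, which this abstract's goal evokes, is a combiner: each map task pre-aggregates its own output locally so it emits one (key, partial count) pair per distinct key instead of one pair per occurrence. A generic sketch (an illustration of the combiner idea, not this paper's method):

```python
from collections import Counter

def map_with_combiner(lines):
    # Map phase with a local combiner: pre-sum counts inside the map task,
    # so only one (word, partial_count) pair per distinct word is shipped.
    local = Counter()
    for line in lines:
        for word in line.split():
            local[word.lower()] += 1
    return list(local.items())

def reduce_counts(partials):
    # Reduce phase: merge the partial counts from all map tasks.
    total = Counter()
    for word, count in partials:
        total[word] += count
    return dict(total)

# Two map tasks over separate input splits:
split_a = ["big data big data", "big cluster"]
split_b = ["data cluster data"]
emitted = map_with_combiner(split_a) + map_with_combiner(split_b)
result = reduce_counts(emitted)
```

Here `split_a` produces 6 raw (word, 1) pairs but the combiner ships only 3, shrinking what crosses the network to the reducer.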
Distributed denial of service (DDoS) attacks continue to grow as a threat to organizations worldwide. From the first known attack in 1999 to the highly publicized Operation Ababil, DDoS attacks have a history of flooding the victim... more
Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license. It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google’s... more
In the rebuilding of economic competence, data is everything and everything is data. However, data depends on a world that is chaotic, unpredictable, and sentimental. The acceleration of so-called big data and the... more
The Apache Hadoop framework is an open source implementation of MapReduce for processing and storing big data. However, getting the best performance from it is a big challenge because of its large number of configuration parameters. In this... more
PageRank evaluates the importance of Web pages with link relations. However, there is no direct method of evaluating the meaning of links in a hyperlink-based Web structure. This feature may cause problems in that pages containing many... more
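The link-based ranking this abstract builds on can be sketched as the basic PageRank power iteration (a generic illustration, not the paper's proposed method): each page keeps a small base rank and receives a damped share of the rank of every page linking to it.

```python
def pagerank(links, damping=0.85, iterations=50):
    # Power-iteration PageRank on an adjacency dict {page: [outlinks]}.
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with uniform rank
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}  # base rank
        for p, outs in links.items():
            if outs:
                # Split p's damped rank evenly over its outlinks.
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:
                # Dangling page: spread its rank uniformly.
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank

graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
```

Note the limitation the abstract points at: the iteration weighs every link identically, with no way to express what a link means.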
At present, the scale of data in many cloud applications increases tremendously in accordance with the Big Data trend, thereby making it a challenge for commonly used software tools to capture, manage and process such large-scale data... more
MapReduce is a simple and powerful programming model which enables the development of scalable parallel applications to process large amounts of data scattered across a cluster of machines. The original implementations of MapReduce... more
The size of data is increasing day by day with the use of social sites. Big Data is a concept for managing and mining large sets of data. Today the concept of Big Data is widely used to mine the insight data of organizations as well... more
Databases storing abundant data usually suffer from slow query and data-manipulation performance. This thesis presents a model and methodology for faster data manipulation (insert/delete) of mass data rows stored in a big table.... more
Big data is an assemblage of large and complex data that is difficult to process with the traditional DBMS tools. The scale, diversity, and complexity of this huge data demand new analytics techniques to extract useful and hidden value... more