HDT is a binary RDF serialization aiming at minimizing the space overheads of traditional RDF formats, while providing retrieval features in compressed space. Several HDT-based applications, such as the recent Linked Data Fragments... more
    • by 
    •   5  
      Data Compression, Web of Data, Semantic Web, RDF
In this paper, we analytically derive, implement, and empirically evaluate a solution for maximizing the execution rate of Map-Reduce jobs subject to power constraints in data centers. Our solution is novel in that it takes into account... more
    • by 
    •   7  
      Computer Science, Distributed Computing, Green Computing, Distributed System
Over the last two decades, the continuous increase in computational power and recent advances in web technology have made large amounts of data available, which calls for large-scale data processing mechanisms to handle this volume of data. MapReduce is a... more
    • by 
    •   20  
      Computer Science, Computer Engineering, Computer Networks, Clouds
Pretty much every part of life now results in the generation of data. Logs are documentation of events or records of system activities and are created automatically through IT systems. Log data analysis is a process of making sense of... more
    • by 
    •   6  
      Computer Science, Hadoop, Mapreduce, Big Data
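As a concrete illustration of the log-data-analysis entry above, here is a minimal Hadoop Streaming mapper sketch that counts log events by severity level; the log format, the severity token set, and the file name mapper.py are assumptions for illustration, not details from the paper. A reducer that sums the emitted counts per key (like the word-count reducer shown later in this list) completes the job.

```python
#!/usr/bin/env python3
# mapper.py (illustrative): emit (severity, 1) for each log line.
# Assumes a simple text log format such as "2023-01-01 12:00:01 ERROR Disk full".
import sys

SEVERITIES = {"DEBUG", "INFO", "WARN", "ERROR", "FATAL"}

for line in sys.stdin:
    for token in line.split():
        if token in SEVERITIES:
            print(f"{token}\t1")   # key<TAB>value, as Hadoop Streaming expects
            break
```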
    • by 
    •   17  
      Computer Science, Network Security, Computer Security, Video Streaming
Twitter produces a massive amount of data due to its popularity, which is one of the sources of big data problems. One of those problems is the classification of tweets, owing to the sophisticated and complex language in use, which makes... more
    • by 
    •   7  
      Cognitive Science, Visualization, Machine Learning, Classification
Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed... more
    • by 
    •   2  
      Mapreduce, Bigdata and Hadoop
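To make the Hadoop entry above concrete, the following is a minimal word-count job written for Hadoop Streaming; the file names mapper.py and reducer.py are illustrative assumptions.

```python
#!/usr/bin/env python3
# mapper.py: read text from stdin and emit (word, 1) pairs.
import sys

for line in sys.stdin:
    for word in line.split():
        print(f"{word}\t1")
```

```python
#!/usr/bin/env python3
# reducer.py: sum counts per word. Hadoop Streaming delivers keys sorted,
# so all pairs for a given word arrive consecutively.
import sys
from itertools import groupby

def parse(stream):
    for line in stream:
        word, count = line.rstrip("\n").split("\t")
        yield word, int(count)

for word, group in groupby(parse(sys.stdin), key=lambda kv: kv[0]):
    print(f"{word}\t{sum(count for _, count in group)}")
```

A job like this is typically launched with the hadoop-streaming jar, passing the two scripts as -mapper and -reducer along with -input and -output paths (the exact jar location depends on the installation).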
The emergence of big data analytics as a way of deriving insights from data brought excitement to mathematicians, statisticians, computer scientists and other professionals. However, the absence of a mathematical foundation for analytics... more
    • by 
    •   13  
      Set Theory, Computer Science, Formal Concept Analysis (Data Mining), Knowledge Discovery in Databases
Data outsourcing allows data owners to keep their data at untrusted clouds that do not ensure the privacy of data and/or computations. One useful framework for fault-tolerant data processing in a distributed fashion is MapReduce, which... more
    • by 
    •   17  
      Information Security, Database Systems, Privacy, Security
Even though many systems implementing customer analytics have been built, it is still an emerging and largely unexplored market with great potential for further advancement. Big data is one of the fastest-rising technology... more
    • by 
    •   6  
      Data Visualization, Hadoop, Mapreduce, Decision Tree
The computer industry is being challenged to develop methods and techniques for affordable data processing on large datasets at optimum response times. The technical challenges in dealing with the increasing demand to handle vast... more
    • by 
    •   4  
      Cloud Computing, Hadoop, Mapreduce, Parallel and Distributed Processing
Big Data refers to large volumes of data generated by the public web, social media and other networks, business applications, scientific instruments, mobile devices, and various sensor technologies. Data mining involves knowledge... more
    • by 
    •   5  
      Hadoop, Clustering, Mapreduce, Big Data
A flexible, efficient and secure networking architecture is required in order to process big data. However, existing network architectures are mostly unable to handle big data. As big data pushes network resources to their limits, it results... more
    • by 
    •   15  
      Network Security, Computer Security, Video Streaming, Hadoop
Data Mining refers to the process of extracting useful information from large datasets. The discovery of interesting association relationships among large amounts of business transactions is currently vital for making appropriate business decisions... more
    • by 
    •   3  
      Association Rule Mining, Mapreduce, Apriori Algorithm
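The association rule entry above centres on the Apriori algorithm; below is a small, self-contained sketch of Apriori's frequent-itemset search in plain Python, not the authors' code. In a MapReduce setting, the support-counting step is the part typically parallelised: mappers count candidate occurrences in their share of the transactions and reducers sum the partial counts.

```python
# Illustrative Apriori frequent-itemset mining (single machine).
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    transactions = [set(t) for t in transactions]
    items = {item for t in transactions for item in t}
    candidates = [frozenset([item]) for item in items]   # candidate 1-itemsets
    k, frequent_all = 1, {}
    while candidates:
        # support counting: how many transactions contain each candidate
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        frequent_all.update(frequent)
        # join step: build (k+1)-itemset candidates from frequent k-itemsets
        candidates = list({a | b for a, b in combinations(frequent, 2)
                           if len(a | b) == k + 1})
        k += 1
    return frequent_all

baskets = [["bread", "milk"], ["bread", "butter"], ["bread", "milk", "butter"]]
print(frequent_itemsets(baskets, min_support=2))
```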
In today’s age of information technology, processing data is a very important issue. Nowadays, even terabytes and petabytes of storage are not sufficient for holding large databases. The data is too big, moves too fast, or doesn’t... more
    • by 
    •   3  
      Hadoop, Mapreduce, Distributed File System
MapReduce has gained remarkable significance as a prominent parallel data processing tool in the research community, academia and industry with the surge in the volume of data to be analyzed. MapReduce is used in different... more
    • by 
    •   5  
      Hadoop, Mapreduce, Computer science & Information Technology, Computer Science and Information Technology
Nowadays, the stock market is attracting more and more people's notice with its challenging risks and high returns. A stock exchange market channels savings and investments, which helps increase the effectiveness of... more
    • by  and +1
    •   9  
      Distributed Computing, Parallel Computing, Virtualization, Cloud Computing
Twitter, a micro-blogging service, has been generating a large amount of data every minute, as it gives people the chance to express their thoughts and feelings quickly and clearly about any topic. To obtain the desired information from these... more
    • by 
    •   6  
      Social Media, Twitter, Hadoop, Classification
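As a toy illustration of tweet classification (not the approach from the paper above), the sketch below applies a keyword-lexicon polarity rule to each tweet; because every tweet is scored independently, the same function could run inside a MapReduce mapper over a large Twitter dataset. The word lists are invented for the example.

```python
# Illustrative keyword-based tweet polarity classification.
POSITIVE = {"good", "great", "love", "happy", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "sad", "awful"}

def classify(tweet: str) -> str:
    words = set(tweet.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(classify("I love this new phone, it is great"))   # -> positive
```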
In distributed systems, database migration is not an easy task. Companies encounter challenges moving data, including legacy data, to the big data platform. This paper shows how to optimize migration from traditional databases to the... more
    • by 
    •   4  
      NoSQL, Hadoop, Mapreduce, CAP Theorem
Accurate recognition and differentiation of human facial expressions require substantial computational power, where the efficiency of the algorithm plays a vital role. Recent advancements in human-computer interaction and object... more
    • by 
    •   20  
      Information Systems, Computer Science, Information Science, Human Computer Interaction
The MapReduce framework has attracted profound attention from many different areas. It is presently a practical model for data-intensive applications due to its simple programming interface, high scalability, and ability to withstand... more
    • by 
    •   4  
      Cloud Computing, Hadoop, Mapreduce, Big Data
This document presents the handbook of BigDataBench (Version 3.1). BigDataBench is an open-source big data benchmark suite, publicly available from http://prof.ict.ac.cn/BigDataBench. After identifying diverse data models and... more
    • by 
    •   10  
      Computer Science, MPI, Mapreduce, Big Data
Hadoop and Spark are widely used distributed processing frameworks for large-scale data processing in an efficient and fault-tolerant manner on private or public clouds. These big-data processing systems are extensively used by many... more
    • by  and +1
    •   6  
      Distributed Computing, Mapreduce, Big Data, MapReduce and Hadoop
In the Big Data community, MapReduce has been seen as one of the key enabling approaches for meeting continuously increasing demands on computing resources imposed by massive data sets. The reason for this is the high scalability of the... more
    • by  and +4
    •   4  
      NoSQL, Mapreduce, Big Data, Big Data Analytics
The MapReduce framework provides an effective technique for processing and analysing large amounts of data. It is a programming model used to rapidly process vast amounts of data in a parallel and distributed mode... more
    • by 
    •   5  
      Seismic data processing, Cloud Computing, Hadoop, Mapreduce
The concept of association rule mining is an important task in data mining. In the case of big data, the large volume of data makes it impossible to generate rules at a faster pace. By making use of parallel execution in Hadoop using the... more
    • by  and +1
    •   5  
      Data Mining, Association Rules Mining, Hadoop, Mapreduce
The efficiency and scalability of the cluster depend heavily on the performance of the single NameNode.
    • by 
    •   7  
      Computer Science, Data Mining, Database Systems, Databases
In order to perform accurate and fast keyword and full-text searches, it is recommended to index the words in the corpus. One way to do this is to use an inverted index to maintain, in a structured form, the occurrences of words in a set of... more
    • by 
    •   5  
      Database Systems, Mapreduce, Stemming, Mongodb
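The inverted-index entry above pairs indexing with stemming, MapReduce and MongoDB; the snippet below only sketches the core data structure, an in-memory word-to-document index, under the assumption of whitespace tokenisation and no stemming. In MapReduce terms, the map step would emit (word, doc_id) pairs and the reduce step would collect each word's posting list.

```python
# Illustrative in-memory inverted index.
from collections import defaultdict

def build_inverted_index(docs):
    """docs: dict of doc_id -> text. Returns word -> sorted list of doc_ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}

docs = {1: "the quick brown fox", 2: "the lazy dog", 3: "quick dog"}
index = build_inverted_index(docs)
print(index["quick"])   # [1, 3]
print(index["dog"])     # [2, 3]
```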
Graphs are everywhere in our lives: social networks, the World Wide Web, biological networks, and many more. The sizes of real-world graphs are growing at an unprecedented rate, spanning millions and billions of nodes and edges. What are the... more
    • by 
    •   5  
      Visual Analytics, Anomaly Detection, Hadoop, Graph Mining
Nowadays, up-to-date information and knowledge have become readily accessible to researchers, enthusiasts, developers, and academics through the Internet on many different subjects across wide areas of application. The underlying... more
    • by  and +1
    •   20  
      Benchmarking, Authentic E-Learning, E-Mentoring, Virtual Benchmarking, Hadoop, MPI
Apriori is designed to operate on databases containing transactions (for example, collections of items bought by customers, or details of website visits). Other algorithms are designed for finding association rules in data having... more
    • by 
    •   31  
      Engineering, Electrical Engineering, Electronic Engineering, Mechanical Engineering
Data has become an indispensable part of every economy, industry, organization, business function and individual. Big Data is a term used to identify datasets whose size is beyond the ability of typical database software tools to... more
    • by 
    •   3  
      Security, Hadoop, Mapreduce
Map-Reduce is a programming model and an associated implementation for processing and generating large data sets. This model has a single point of failure: the master, which coordinates the work in a cluster. In contrast, wireless... more
    • by  and +1
    •   18  
      Fault Tolerant Computing, Distributed Computing, Fault Tolerant Systems, Middleware
In recent years, big data are generated from a variety of sources, and there is an enormous demand for storing, managing, processing, and querying big data. The MapReduce framework and its open-source implementation, Hadoop, have proven... more
    • by 
    •   5  
      Hadoop, Mapreduce, Big Data, Stream Processing
Bigdata is a horizontally scaled, open-source storage and computing fabric for indexed data that supports optional transactions and very high concurrency, and operates in both a single-machine mode and a cluster mode. The bigdata... more
    • by 
    •   7  
      Computer Science, Software Engineering, Data Mining, Database Systems
The main aim of this paper is to reduce the burden of the single reducer per map. Nowadays, MapReduce performance improvement is very significant for big data processing. In previous works, we tried to reduce the network traffic cost for... more
    • by 
    •   3  
      Mapreduce, Load Balancing, MapReduce and Hadoop
Distributed denial of service (DDoS) attacks continue to grow as a threat to organizations worldwide. From the first known attack in 1999 to the highly publicized Operation Ababil, DDoS attacks have a history of flooding the victim... more
    • by 
    •   5  
      Hadoop, DDoS, Mapreduce, Characteristics
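One simple building block for the DDoS characterisation described above is counting requests per source address over web-server logs; the Hadoop Streaming mapper below sketches that idea under the assumption of an Apache-style access log whose first field is the client IP (this is an illustration, not the paper's detection method). A summing reducer then exposes sources with abnormally high request counts.

```python
#!/usr/bin/env python3
# Illustrative mapper: emit (client_ip, 1) per access-log line.
import sys

for line in sys.stdin:
    fields = line.split()
    if fields:                      # skip blank lines
        print(f"{fields[0]}\t1")    # key = client IP, value = 1
```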
Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license. It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google’s... more
    • by 
    •   2  
      Mapreduce, HDFS
During the recreation of monetary competence, data is everything and everything is data. However, data depends on a world that is chaotic, insane, unpredictable, and sentimental. The acceleration of so-called big data and the... more
    • by 
    •   5  
      NoSQL, Hadoop, Mapreduce, Big Data
The Apache Hadoop framework is an open-source implementation of MapReduce for processing and storing big data. However, getting the best performance from it is a big challenge because of its large number of configuration parameters. In this... more
    • by 
    •   5  
      Machine Learning, Hadoop, Mapreduce, Parameters
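Since the entry above concerns Hadoop's many configuration parameters, the sketch below shows one hedged example of how a handful of commonly tuned parameters can be passed to a streaming job as -D generic options; the jar path, input/output paths, and the chosen values are placeholders, not recommendations from the paper.

```python
# Illustrative launch of a Hadoop Streaming job with a few tuned parameters.
import subprocess

tuned = {
    "mapreduce.job.reduces": "8",        # number of reduce tasks
    "mapreduce.task.io.sort.mb": "256",  # map-side sort buffer (MB)
    "mapreduce.map.memory.mb": "2048",   # container memory per map task (MB)
}

cmd = ["hadoop", "jar", "/path/to/hadoop-streaming.jar"]
for key, value in tuned.items():
    cmd += ["-D", f"{key}={value}"]
cmd += ["-input", "/data/in", "-output", "/data/out",
        "-mapper", "mapper.py", "-reducer", "reducer.py"]

subprocess.run(cmd, check=True)
```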
PageRank evaluates the importance of Web pages based on their link relations. However, there is no direct method of evaluating the meaning of links in a hyperlink-based Web structure. This feature may cause problems in that pages containing many... more
    • by 
    •   9  
      Semantic Web Technologies, Web Programming, Large scale systems, Mapreduce
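For reference alongside the PageRank entry above, here is a compact power-iteration PageRank in plain Python; it is not the paper's link-semantics method, and each iteration corresponds roughly to one MapReduce round in which a page distributes rank/out-degree to the pages it links to.

```python
# Illustrative power-iteration PageRank.
def pagerank(links, damping=0.85, iterations=20):
    """links: dict mapping page -> list of pages it links to."""
    pages = set(links) | {p for outs in links.values() for p in outs}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for page, outs in links.items():
            if outs:
                share = damping * rank[page] / len(outs)
                for target in outs:
                    new[target] += share
            else:                      # dangling page: spread rank uniformly
                for target in pages:
                    new[target] += damping * rank[page] / n
        rank = new
    return rank

print(pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]}))
```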
At present, the scale of data in many cloud applications increases tremendously in accordance with the Big Data trend, thereby making it a challenge for commonly used software tools to capture, manage and process such large-scale data... more
    • by 
    •   12  
      Computer Science, Information Security, Computer Engineering, Machine Learning
MapReduce is a simple and powerful programming model which enables the development of scalable parallel applications to process large amounts of data scattered across a cluster of machines. The original implementations of MapReduce... more
    • by 
    •   2  
      Hadoop, Mapreduce
The size of data is increasing day by day with the use of social sites. Big Data is a concept for managing and mining large sets of data. Today the concept of Big Data is widely used to mine insights from the data of organizations as well... more
    • by 
    •   5  
      Data Mining, Hadoop, Mapreduce, Big Data
Database storage holding abundant data is usually accompanied by slow query and data-manipulation performance. This thesis presents a model and methodology for faster data manipulation (insert/delete) of mass data rows stored in a big table... more
    • by 
    •   4  
      Mapreduce, Big Data, Partitioning, Big Table
Big data is an assemblage of large and complex data that is difficult to process with traditional DBMS tools. The scale, diversity, and complexity of these huge data demand new analytics techniques to extract useful and hidden value... more
    • by 
    •   4  
      Mapreduce, Big Data, Discretization, Preprocessing
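Discretization (binning continuous values) is one preprocessing step named in the entry above; the snippet below sketches simple equal-width binning, which is only one of several possible discretization schemes and not necessarily the authors'.

```python
# Illustrative equal-width discretization.
def equal_width_bins(values, n_bins):
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0            # avoid zero width if all equal
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

print(equal_width_bins([1.0, 2.5, 7.0, 9.9, 10.0], n_bins=4))   # [0, 0, 2, 3, 3]
```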