HDFS
Recent papers in HDFS
A step-by-step guide to installing Hadoop on CentOS.
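For anyone following such a walkthrough, a quick smoke test helps confirm the cluster is reachable once the daemons are up. The sketch below is a hypothetical check, not part of the guide; it assumes a single-node setup with the NameNode at hdfs://localhost:9000 (the actual address comes from your core-site.xml).

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsSmokeTest {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumed address for a local single-node install; adjust to match core-site.xml.
    conf.set("fs.defaultFS", "hdfs://localhost:9000");

    // Listing the root directory succeeds only if the NameNode is up and reachable.
    try (FileSystem fs = FileSystem.get(conf)) {
      for (FileStatus status : fs.listStatus(new Path("/"))) {
        System.out.println(status.getPath());
      }
    }
  }
}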
— Cloud computing is one of the emerging techniques for processing big data; it is also known as service on demand. A large set or large volume of data is known as big data. Processing big data (MRI images and DICOM images)... more
Cancer is a key disease that has become the greatest risk to public health because of the difficulty of its early recognition. According to a study by the WHO, in 2019 and since there have been four million new cases of cancer and 28.69 million... more
Hadoop is the main framework used to process and manage large amounts of data. Anyone who works with programming or data science should become familiar with the platform.
Big Data is large-volume data generated by the public web, social media and other networks, business applications, scientific instruments, mobile devices, and various sensor technologies. Data mining involves knowledge... more
Big data is so huge that it has become difficult to handle, so it requires special technology. Hadoop is the Apache Foundation's framework, which aims to provide efficient storage and analytics for big data;... more
Dissertation report on HDFC LIC Ltd
In today's world, many actors in digital technology produce endless quantities of data. Sensors, social networks, and e-commerce all generate information that accumulates in real time according to... more
With the increased usage of the internet, data volumes are also growing exponentially year on year. To handle such enormous data, a better platform for processing it was needed, so a programming model was introduced... more
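The programming model referred to here is MapReduce. Its canonical illustration is word counting: mappers emit (word, 1) pairs and a reducer sums the counts per word. Below is a minimal sketch against the standard Hadoop MapReduce Java API, assuming input and output HDFS paths are passed on the command line.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every token in the input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: sum the counts collected for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local pre-aggregation on each mapper
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // must not already exist
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}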
The efficiency and scalability of the cluster depend heavily on the performance of the single NameNode.
The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably, and to stream those data sets at high bandwidth to user applications. In a large cluster, thousands of servers both host directly attached... more
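As a concrete view of that storage model, here is a minimal sketch of writing and then reading a file through the HDFS FileSystem Java API. The NameNode address is an assumption for a local setup; the comments summarize HDFS's default behavior rather than details specific to this paper.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadWrite {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumed NameNode address; in a real cluster this comes from core-site.xml.
    conf.set("fs.defaultFS", "hdfs://localhost:9000");

    try (FileSystem fs = FileSystem.get(conf)) {
      Path path = new Path("/tmp/hello.txt");

      // Write: the client streams bytes to a pipeline of DataNodes;
      // each block is replicated (3 copies by default).
      try (FSDataOutputStream out = fs.create(path, true)) {
        out.write("hello, hdfs\n".getBytes(StandardCharsets.UTF_8));
      }

      // Read: the NameNode supplies block locations, then the client
      // streams the data directly from a nearby DataNode.
      try (BufferedReader in = new BufferedReader(
          new InputStreamReader(fs.open(path), StandardCharsets.UTF_8))) {
        System.out.println(in.readLine());
      }
    }
  }
}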
Abstract: Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license. It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google’s... more
Nowadays, petabytes of data are the norm in industry. Handling and analyzing such big data is a challenging task. Even frameworks like Hadoop (an open-source implementation of the MapReduce paradigm) and NoSQL databases like Cassandra and HBase... more
Firms are producing data at ever-increasing rates and are finding new ways to use that data to create business value. The generated volumes of data create the need for better and cheaper storage options that allow... more
The Apache Hadoop framework is an open source implementation of MapReduce for processing and storing big data. However, getting the best performance from it is a big challenge because of its large number of configuration parameters. In this... more
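To give a sense of the parameter surface such tuning studies explore, the sketch below sets a few well-known MapReduce knobs programmatically. The values are illustrative placeholders, not recommendations from this work; good settings depend on the hardware and the job.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class TunedJobSetup {
  public static Job buildJob() throws Exception {
    Configuration conf = new Configuration();

    // A few of the many knobs that affect MapReduce performance.
    // The values below are examples only; they are not tuned recommendations.
    conf.setInt("mapreduce.task.io.sort.mb", 256);               // map-side sort buffer size
    conf.setFloat("mapreduce.map.sort.spill.percent", 0.9f);     // buffer fill ratio before spilling
    conf.setInt("mapreduce.reduce.shuffle.parallelcopies", 10);  // concurrent shuffle fetchers
    conf.setBoolean("mapreduce.map.output.compress", true);      // compress intermediate output

    return Job.getInstance(conf, "tuned job");
  }
}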
Hadoop is an open-source framework that allows the distributed processing of large data sets across clusters of standard servers. With two modules, the Hadoop Distributed File System (HDFS) and the MapReduce framework, it is... more
The combination of the two fast-developing research areas Semantic Web and Web Mining is called Semantic Web Mining. The huge increase in the amount of Semantic Web data made it an ideal target for some... more
Big data is a collection of structured and unstructured data sets that involve huge quantities of data, social media analytics, data management capabilities, and real-time data. For big data processing, Hadoop uses the MapReduce paradigm.... more
Big data refers to structured, unstructured, and semi-structured data of large volume that is difficult to manage and costly to store. Using explanatory analysis techniques to understand such raw data and carefully balance the benefits in... more
Data analytics has been growing rapidly in a variety of application areas, such as mining business intelligence from huge amounts of data. The MapReduce programming paradigm lends itself well to these data-intensive analytics jobs, given... more
Quantitative trace element data from high-purity gem diamonds from the Victor Mine, Ontario, Canada as well as near-gem diamonds from peridotite and eclogite xenoliths from the Finsch and Newlands mines, South Africa, acquired using an... more
Big data can be described as a collection of structured, semi-structured, and unstructured data sets that contain large amounts of data: social media analytics, data management capability, real-time information. For big data processing, Hadoop... more
The MapReduce model has become an important parallel processing model for large-scale data-intensive applications like data mining and web indexing. Hadoop, an open-source implementation of MapReduce, is widely applied to support cluster... more
Lung cancer patients are at serious risk from COVID-19, and the reported high mortality rate among lung cancer patients with COVID-19 has given pause to oncologists, who are faced... more
Clustering is a process of grouping objects that are similar to one another but dissimilar to objects in other groups. Clustering a large dataset is a challenging, data- and resource-intensive task. The key to the scalability and performance benefits it... more
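To make the expensive inner step of such clustering concrete, here is a small self-contained sketch of the nearest-centroid assignment at the heart of k-means. In a MapReduce formulation, each mapper would run this per point and emit (centroidId, point) pairs for reducers to average into new centroids. This is a generic illustration, not the specific algorithm of the paper.

public class KMeansAssignStep {
  // Assign a point to its nearest centroid by squared Euclidean distance.
  static int nearestCentroid(double[] point, double[][] centroids) {
    int best = 0;
    double bestDist = Double.MAX_VALUE;
    for (int c = 0; c < centroids.length; c++) {
      double dist = 0;
      for (int d = 0; d < point.length; d++) {
        double diff = point[d] - centroids[c][d];
        dist += diff * diff;
      }
      if (dist < bestDist) {
        bestDist = dist;
        best = c;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    double[][] centroids = { { 0, 0 }, { 10, 10 } };
    double[] point = { 9, 8 };
    System.out.println(nearestCentroid(point, centroids)); // prints 1
  }
}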
In today's world, where the Internet is essential and petabytes of data are produced per hour, there is a drastic need to speed up the performance and throughput of cloud systems. Traditional cloud systems were not able to give... more
There is an explosion in the volume of data in the world. The amount of data is increasing by leaps and bounds. The sources are individuals, social media, organizations, etc. The data may be structured, semi-structured or unstructured.... more
This paper describes the outcome of an attempt to implement the same transitive closure (TC) algorithm for Apache MapReduce running on different Apache Hadoop distributions. Apache MapReduce is a software framework used with Apache... more
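For reference, here is a small in-memory sketch of what a transitive closure computation does: each round joins the closure found so far with the base edge set and stops at a fixpoint, mirroring the per-iteration join an Apache MapReduce implementation would perform. It is a toy illustration, not the paper's distributed version.

import java.util.HashSet;
import java.util.Set;

public class TransitiveClosure {
  record Edge(int from, int to) {}

  // Naive iterative TC: repeatedly join the current closure with the base
  // edges on matching endpoints, until no new pairs appear (a fixpoint).
  static Set<Edge> closure(Set<Edge> edges) {
    Set<Edge> result = new HashSet<>(edges);
    boolean changed = true;
    while (changed) {
      Set<Edge> discovered = new HashSet<>();
      for (Edge a : result) {
        for (Edge b : edges) {
          if (a.to() == b.from()) {
            discovered.add(new Edge(a.from(), b.to()));
          }
        }
      }
      changed = result.addAll(discovered);
    }
    return result;
  }

  public static void main(String[] args) {
    Set<Edge> edges = Set.of(new Edge(1, 2), new Edge(2, 3), new Edge(3, 4));
    System.out.println(closure(edges).contains(new Edge(1, 4))); // true
  }
}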
Abstract: Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and... more
Nowadays, producing streams of data is not helpful if you cannot store them somewhere. Applications, software, and objects generate huge masses of data, which need to be collected, stored, and made available for analysis. Moreover, these... more
The objective of the proposed system is to integrate high volumes of data with important considerations such as monitoring a wide array of heterogeneous security systems. When a real-time cyber attack occurs, the Intrusion Detection... more
With the explosion of data in applications all around us, erasure-coded storage has emerged as an attractive alternative to replication because, even with significantly lower storage overhead, it provides better reliability against data... more
Erasure codes are an integral part of many distributed storage systems aimed at Big Data, since they provide high fault tolerance at low overhead. However, traditional erasure codes are inefficient at replenishing lost data (vital for... more
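For intuition about that replenishment cost, below is a toy single-parity erasure code: one XOR parity block protects k data blocks, so any single lost block can be rebuilt, but the repair must read all k surviving blocks. That read amplification is exactly the inefficiency that motivates newer code designs. A minimal sketch, not the codes evaluated here.

public class SingleParityErasure {
  // Encode: parity = d0 XOR d1 XOR ... XOR d(k-1), all blocks equal length.
  static byte[] parity(byte[][] blocks) {
    byte[] p = new byte[blocks[0].length];
    for (byte[] block : blocks) {
      for (int i = 0; i < p.length; i++) {
        p[i] ^= block[i];
      }
    }
    return p;
  }

  // Repair: a single lost data block equals the XOR of the parity block
  // with every surviving data block, so all survivors must be read.
  static byte[] repair(byte[][] survivors, byte[] parityBlock) {
    byte[][] all = new byte[survivors.length + 1][];
    System.arraycopy(survivors, 0, all, 0, survivors.length);
    all[survivors.length] = parityBlock;
    return parity(all);
  }

  public static void main(String[] args) {
    byte[][] data = { "HDFS".getBytes(), "hold".getBytes(), "data".getBytes() };
    byte[] p = parity(data);

    // Lose data[1]; rebuild it from the two survivors plus the parity block.
    byte[] rebuilt = repair(new byte[][] { data[0], data[2] }, p);
    System.out.println(new String(rebuilt)); // prints "hold"
  }
}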
New advanced technology, the enhanced capacity of storage media, the maturity of information technology, and the popularity of social media, business intelligence, and scientific research produce huge amounts of data, which has made an ample set... more