HDFS
Recent papers in HDFS
A step-by-step guide to installing Hadoop on CentOS.
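For anyone following such a walkthrough, a quick smoke test helps confirm the cluster is reachable once the daemons are up. The sketch below is a hypothetical check, not part of the guide; it assumes a single-node setup with the NameNode at hdfs://localhost:9000 (the actual address comes from your core-site.xml).

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsSmokeTest {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumed address for a local single-node install; adjust to match core-site.xml.
    conf.set("fs.defaultFS", "hdfs://localhost:9000");

    // Listing the root directory succeeds only if the NameNode is up and reachable.
    try (FileSystem fs = FileSystem.get(conf)) {
      for (FileStatus status : fs.listStatus(new Path("/"))) {
        System.out.println(status.getPath());
      }
    }
  }
}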
— Cloud computing is one of the emerging techniques for processing big data; it is also known as service on demand. A large set or large volume of data is known as big data. Processing big data (MRI images and DICOM images)... more
Cancer is a key disease that has become the greatest risk to public health because of the difficulty of its early recognition. According to a study by the WHO, in 2019 and since there have been four million new cases of cancer and 28.69 million... more
Hadoop is the main framework used to process and manage large amounts of data. Anyone who works with programming or data science should become familiar with the platform.
Big Data is large-volume data generated by the public web, social media and other networks, business applications, scientific instruments, mobile devices, and various sensor technologies. Data mining involves knowledge... more
Big data is so huge that it has become difficult to handle, so it requires special technology. Hadoop is the Apache Foundation's framework, which aims to provide efficient storage and analytics for big data;... more
Dissertation report on HDFC LIC Ltd
In today's world, many actors in digital technology produce endless quantities of data. Sensors, social networks, and e-commerce all generate information that accumulates in real time according to... more
With the increased usage of the internet, data volumes are also growing exponentially year on year. To handle such enormous data, a better platform for processing it was needed, so a programming model was introduced... more
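The programming model referred to here is MapReduce. Its canonical illustration is word counting: mappers emit (word, 1) pairs and a reducer sums the counts per word. Below is a minimal sketch against the standard Hadoop MapReduce Java API, assuming input and output HDFS paths are passed on the command line.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every token in the input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: sum the counts collected for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local pre-aggregation on each mapper
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // must not already exist
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}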
The efficiency and scalability of the cluster depend heavily on the performance of the single NameNode.
The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably, and to stream those data sets at high bandwidth to user applications. In a large cluster, thousands of servers both host directly attached... more
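As a concrete view of that storage model, here is a minimal sketch of writing and then reading a file through the HDFS FileSystem Java API. The NameNode address is an assumption for a local setup; the comments summarize HDFS's default behavior rather than details specific to this paper.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadWrite {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumed NameNode address; in a real cluster this comes from core-site.xml.
    conf.set("fs.defaultFS", "hdfs://localhost:9000");

    try (FileSystem fs = FileSystem.get(conf)) {
      Path path = new Path("/tmp/hello.txt");

      // Write: the client streams bytes to a pipeline of DataNodes;
      // each block is replicated (3 copies by default).
      try (FSDataOutputStream out = fs.create(path, true)) {
        out.write("hello, hdfs\n".getBytes(StandardCharsets.UTF_8));
      }

      // Read: the NameNode supplies block locations, then the client
      // streams the data directly from a nearby DataNode.
      try (BufferedReader in = new BufferedReader(
          new InputStreamReader(fs.open(path), StandardCharsets.UTF_8))) {
        System.out.println(in.readLine());
      }
    }
  }
}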
Abstract: Apache Hadoop is a software framework that supports data-intensive distributed applications under a free license. It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google’s... more
Nowadays, petabytes of data are the norm in industry. Handling and analyzing such big data is a challenging task. Even frameworks like Hadoop (an open-source implementation of the MapReduce paradigm) and NoSQL databases like Cassandra and HBase... more
Firms are producing data at ever-increasing rates and are finding new ways to use that data to create business value. The generated volumes of data create the need for better and cheaper storage options that allow... more
The Apache Hadoop framework is an open source implementation of MapReduce for processing and storing big data. However, getting the best performance from it is a big challenge because of its large number of configuration parameters. In this... more
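To give a sense of the parameter surface such tuning studies explore, the sketch below sets a few well-known MapReduce knobs programmatically. The values are illustrative placeholders, not recommendations from this work; good settings depend on the hardware and the job.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class TunedJobSetup {
  public static Job buildJob() throws Exception {
    Configuration conf = new Configuration();

    // A few of the many knobs that affect MapReduce performance.
    // The values below are examples only; they are not tuned recommendations.
    conf.setInt("mapreduce.task.io.sort.mb", 256);               // map-side sort buffer size
    conf.setFloat("mapreduce.map.sort.spill.percent", 0.9f);     // buffer fill ratio before spilling
    conf.setInt("mapreduce.reduce.shuffle.parallelcopies", 10);  // concurrent shuffle fetchers
    conf.setBoolean("mapreduce.map.output.compress", true);      // compress intermediate output

    return Job.getInstance(conf, "tuned job");
  }
}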
Hadoop is an open-source framework that allows the distributed processing of large data sets across clusters of standard servers. With two modules, the Hadoop Distributed File System (HDFS) and the MapReduce framework, it is... more
The combination of the two fast-developing research areas Semantic Web and Web Mining is called Semantic Web Mining. The huge increase in the amount of Semantic Web data made it an ideal target for some... more
Big data is a collection of structured and unstructured data sets that involve huge quantities of data, social media analytics, data management capabilities, and real-time data. For big data processing, Hadoop uses the MapReduce paradigm.... more
Big data refers to structured, unstructured, and semi-structured data of large volume that is difficult to manage and costly to store. Using explanatory analysis techniques to understand such raw data and carefully balance the benefits in... more
Data analytics has been growing rapidly in a variety of application areas, such as mining business intelligence from huge amounts of data. The MapReduce programming paradigm lends itself well to these data-intensive analytics jobs, given... more
Quantitative trace element data from high-purity gem diamonds from the Victor Mine, Ontario, Canada as well as near-gem diamonds from peridotite and eclogite xenoliths from the Finsch and Newlands mines, South Africa, acquired using an... more
Big data can be described as a collection of structured, semi-structured, and unstructured data sets that contain large amounts of data: social media analytics, data management capability, real-time information. For big data processing, Hadoop... more
The MapReduce model has become an important parallel processing model for large-scale data-intensive applications like data mining and web indexing. Hadoop, an open-source implementation of MapReduce, is widely applied to support cluster... more
Lung cancer patients are at serious risk from COVID-19, and the reported high mortality rate among lung cancer patients with COVID-19 has given pause to oncologists, who are faced... more
Clustering is a process of grouping objects that are similar to one another but dissimilar to objects in other groups. Clustering a large dataset is a challenging, data- and resource-intensive task. The key to the scalability and performance benefits it... more
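To make the expensive inner step of such clustering concrete, here is a small self-contained sketch of the nearest-centroid assignment at the heart of k-means. In a MapReduce formulation, each mapper would run this per point and emit (centroidId, point) pairs for reducers to average into new centroids. This is a generic illustration, not the specific algorithm of the paper.

public class KMeansAssignStep {
  // Assign a point to its nearest centroid by squared Euclidean distance.
  static int nearestCentroid(double[] point, double[][] centroids) {
    int best = 0;
    double bestDist = Double.MAX_VALUE;
    for (int c = 0; c < centroids.length; c++) {
      double dist = 0;
      for (int d = 0; d < point.length; d++) {
        double diff = point[d] - centroids[c][d];
        dist += diff * diff;
      }
      if (dist < bestDist) {
        bestDist = dist;
        best = c;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    double[][] centroids = { { 0, 0 }, { 10, 10 } };
    double[] point = { 9, 8 };
    System.out.println(nearestCentroid(point, centroids)); // prints 1
  }
}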
In today's world, where the Internet is essential and petabytes of data are produced per hour, there is a drastic need to speed up the performance and throughput of cloud systems. Traditional cloud systems were not able to give... more
There is an explosion in the volume of data in the world. The amount of data is increasing by leaps and bounds. The sources are individuals, social media, organizations, etc. The data may be structured, semi-structured or unstructured.... more
This paper describes the outcome of an attempt to implement the same transitive closure (TC) algorithm for Apache MapReduce running on different Apache Hadoop distributions. Apache MapReduce is a software framework used with Apache... more
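For reference, here is a small in-memory sketch of what a transitive closure computation does: each round joins the closure found so far with the base edge set and stops at a fixpoint, mirroring the per-iteration join an Apache MapReduce implementation would perform. It is a toy illustration, not the paper's distributed version.

import java.util.HashSet;
import java.util.Set;

public class TransitiveClosure {
  record Edge(int from, int to) {}

  // Naive iterative TC: repeatedly join the current closure with the base
  // edges on matching endpoints, until no new pairs appear (a fixpoint).
  static Set<Edge> closure(Set<Edge> edges) {
    Set<Edge> result = new HashSet<>(edges);
    boolean changed = true;
    while (changed) {
      Set<Edge> discovered = new HashSet<>();
      for (Edge a : result) {
        for (Edge b : edges) {
          if (a.to() == b.from()) {
            discovered.add(new Edge(a.from(), b.to()));
          }
        }
      }
      changed = result.addAll(discovered);
    }
    return result;
  }

  public static void main(String[] args) {
    Set<Edge> edges = Set.of(new Edge(1, 2), new Edge(2, 3), new Edge(3, 4));
    System.out.println(closure(edges).contains(new Edge(1, 4))); // true
  }
}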
Abstract: Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and... more
Nowadays, producing streams of data is not helpful if you cannot store them somewhere. Applications, software, and objects generate huge masses of data, which need to be collected, stored, and made available for analysis. Moreover, these... more
The objective of the proposed system is to integrate high volumes of data with important considerations such as monitoring a wide array of heterogeneous security systems. When a real-time cyber attack occurs, the Intrusion Detection... more
With the explosion of data in applications all around us, erasure-coded storage has emerged as an attractive alternative to replication because, even with significantly lower storage overhead, it provides better reliability against data... more
Erasure codes are an integral part of many distributed storage systems aimed at Big Data, since they provide high fault tolerance at low overhead. However, traditional erasure codes are inefficient at replenishing lost data (vital for... more
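For intuition about that replenishment cost, below is a toy single-parity erasure code: one XOR parity block protects k data blocks, so any single lost block can be rebuilt, but the repair must read all k surviving blocks. That read amplification is exactly the inefficiency that motivates newer code designs. A minimal sketch, not the codes evaluated here.

public class SingleParityErasure {
  // Encode: parity = d0 XOR d1 XOR ... XOR d(k-1), all blocks equal length.
  static byte[] parity(byte[][] blocks) {
    byte[] p = new byte[blocks[0].length];
    for (byte[] block : blocks) {
      for (int i = 0; i < p.length; i++) {
        p[i] ^= block[i];
      }
    }
    return p;
  }

  // Repair: a single lost data block equals the XOR of the parity block
  // with every surviving data block, so all survivors must be read.
  static byte[] repair(byte[][] survivors, byte[] parityBlock) {
    byte[][] all = new byte[survivors.length + 1][];
    System.arraycopy(survivors, 0, all, 0, survivors.length);
    all[survivors.length] = parityBlock;
    return parity(all);
  }

  public static void main(String[] args) {
    byte[][] data = { "HDFS".getBytes(), "hold".getBytes(), "data".getBytes() };
    byte[] p = parity(data);

    // Lose data[1]; rebuild it from the two survivors plus the parity block.
    byte[] rebuilt = repair(new byte[][] { data[0], data[2] }, p);
    System.out.println(new String(rebuilt)); // prints "hold"
  }
}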
New advanced technology, the enhanced capacity of storage media, the maturity of information technology, and the popularity of social media, business intelligence, and scientific research produce huge amounts of data, which has made an ample set... more