
All Questions

0 votes
0 answers
12 views

How to handle duplicate key outputs in the Mapper phase for HDFS PageRank implementation?

I was writing the PageRank code to run on HDFS, so I wrote the Mapper and Reducer. The data I have is in the following format: page 'outgoing_links,' such as: Page_1 Page_18,Page_109,Page_696,...
Khaled Saleh
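
Not the asker's code, but as a point of reference for the question above, here is a minimal sketch of one common way to lay out a PageRank mapper over lines shaped like "Page_1 Page_18,Page_109,...". The class name, the LINKS: marker, and the placeholder rank value are illustrative assumptions; the reducer is expected to sum all contributions arriving under the same key, so duplicate keys emitted by the mapper are normal.

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class PageRankMapperSketch extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Expected line shape (from the question): "Page_1 Page_18,Page_109,Page_696,..."
        String[] parts = value.toString().split("\\s+", 2);
        if (parts.length < 2) {
            return; // skip lines with no outgoing links recorded
        }
        String page = parts[0];
        String[] links = parts[1].split(",");
        double rank = 1.0; // placeholder; a real iteration carries the current rank in the record

        // One contribution per outgoing link; the reducer sums everything
        // arriving under a key, so duplicate keys from different mappers are expected.
        for (String link : links) {
            context.write(new Text(link), new Text(Double.toString(rank / links.length)));
        }
        // Re-emit the adjacency list so the reducer can rebuild the graph for the next iteration.
        context.write(new Text(page), new Text("LINKS:" + parts[1]));
    }
}
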
-1 votes
1 answer
26 views

Is the Hadoop documentation wrong for set

The documentation of the Hadoop Job API gives the following example. From https://hadoop.apache.org/docs/r3.3.5/api/org/apache/hadoop/mapreduce/Job.html: Here is an example on how to submit a job: // ...
user1551605
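
For orientation on the question above: a hedged sketch of the submission pattern commonly used with the org.apache.hadoop.mapreduce API, using the lib.input/lib.output helpers for paths and the identity Mapper/Reducer as placeholders. This is a sketch, not a quote from the linked javadoc.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SubmitSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "myjob");
        job.setJarByClass(SubmitSketch.class);

        // Identity mapper/reducer as stand-ins; a real job plugs in its own classes.
        job.setMapperClass(Mapper.class);
        job.setReducerClass(Reducer.class);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);

        // Paths are set through the lib.input/lib.output helpers, not on Job itself.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // Submit and poll for progress until the job completes.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
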
1 vote
0 answers
19 views

Dataproc Hive Job - OutOfMemoryError: Java heap space

I have a Dataproc cluster; we are running an INSERT OVERWRITE query through the Hive CLI, which fails with OutOfMemoryError: Java heap space. We adjusted memory configurations for reducers and Tez tasks, ...
Parmeet Singh
0 votes
1 answer
80 views

Hive Always Fails at Mapreduce

I just installed Hadoop 3.3.6 and Hive 4.0.0 with MySQL as the metastore. When running create table or select * from... it runs well. But when I try to do an insert or a select with a join, Hive always fails. I'm ...
Dzaki Wicaksono
0 votes
0 answers
22 views

Hadoop MapReduce execution error: Failing the application

2024-08-29 16:25:51,669 INFO mapreduce.Job: map 0% reduce 0% 2024-08-29 16:25:51,687 INFO mapreduce.Job: Job job_1724916787445_0010 failed with state FAILED due to: Application ...
谢志强
0 votes
1 answer
37 views

Map Reduce Job Failing with OOM [org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster]

I'm providing comma-separated filenames to the FileInputFormat in a MapReduce job. The total size of the data is 30 GB of snappy-compressed ORC files. When my MapReduce job is starting immediately ...
Nikhil Lingam
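
A note on the question above: when the OutOfMemoryError occurs in MRAppMaster itself rather than in the map or reduce tasks, the settings that usually matter are the AM container size and its JVM options. A hedged sketch using the standard property names; the values are illustrative, not recommendations.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class AmMemorySketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Standard Hadoop property names; the values below are made-up examples, tune per cluster.
        conf.set("yarn.app.mapreduce.am.resource.mb", "4096");       // AM container size
        conf.set("yarn.app.mapreduce.am.command-opts", "-Xmx3276m"); // AM JVM heap inside that container
        Job job = Job.getInstance(conf, "orc-aggregation");
        // ... remaining job setup as usual ...
    }
}
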
0 votes
0 answers
15 views

How to read MapContext from jobClient (or another API)?

I'm submitting a job in a hadoop cluster. This job has a file and is using InputFormatClass = NLineInputFormat. After the job starts, it will create several map tasks, each one with a line of the ...
user3783810
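
For context on the excerpt above, the NLineInputFormat behaviour it describes is normally configured on the submitting side roughly as follows (a sketch; one line per split is an assumption taken from the question):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;

public class NLineSetupSketch {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "nline-job");
        // Each map task receives this many consecutive lines of the input file.
        job.setInputFormatClass(NLineInputFormat.class);
        NLineInputFormat.setNumLinesPerSplit(job, 1);
        // The same setting is exposed as the configuration key
        // mapreduce.input.lineinputformat.linespermap.
    }
}
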
0 votes
0 answers
18 views

HBase 2.0.0 with Hadoop 2.6.5: HBase shell Unhandled Java exception: java.lang.IncompatibleClassChangeError

I am running Hadoop 2.6.5 with HBase 2.0.0. When I try to log into HBase with $ hbase shell, I get the output below. I tried HBase 2.5.8 with Hadoop 2.6.5, same issue. I tried HBase 1.3.1 with ...
user3884278
0 votes
0 answers
58 views

Hadoop cannot run a MapReduce operation because it hangs waiting for the AM container to be allocated

I used my teacher's guide to install and configure Hadoop (it's my first time using it). It says it is for Ubuntu, but it should work for all Linux distros since it just compiles Hadoop from source. Since ...
Alberto
0 votes
0 answers
21 views

Apache Pig throwing NullPointerException when filtering using a parameter: ERROR 2000: Error processing rule PartitionFilterOptimizer

I am trying to store data in Apache Hive using a filter operation where MYVARIABLE is a parameter that is passed in. Filtering using the variable throws a NullPointerException. This happens when I try to store data in ...
Luff li
0 votes
0 answers
30 views

Every time I run a jar file using Hadoop it always gets stuck

Every time I run a jar file using Hadoop it always gets stuck at this last line. I also tried the jar located in the Hadoop folder itself, but with the same result: it gets stuck at the last line, ...
Noor Khalil
0 votes
0 answers
22 views

Streaming command failed inside Hadoop

I have the following code put in and when I run it inside Hadoop I get the error message. #!/usr/local/hadoop/bin/hdfs dfs -rm -r /users/spatel/output/ !/usr/local/hadoop/bin/hadoop jar /usr/local/...
Shekhar
0 votes
0 answers
70 views

AWS Emr Map Reduce job logs are in stderr

I'm running an MR job in EMR and all my logs are in the stderr section (when I go into the job logs from the Resource Manager UI). How can I move them to stdout or syslog?
Stefan Ss
0 votes
0 answers
47 views

Hadoop MapReduce word count - /tmp/user is not recognized as an internal or external command, operable program or batch file

I'm new to Hadoop and I need to use a word counter for a school project. Everything went fine with the Hadoop installation, until this error showed up when I ran the mapreducer.jar program to count word ...
Natália Gálová
0 votes
0 answers
36 views

Unable to Submit MapReduce Job from Java Client to Hadoop Cluster Running in Pseudo-Distributed Mode

I'm working on a project where I need to perform aggregations on the result of an HBase table scan using MapReduce and store the result in another HBase table. To achieve this, I've set up a Hadoop ...
Pedro Gomes
0 votes
0 answers
13 views

Hadoop - Final output of the reducer is dependent on values which belong to other keys

To summarize the problem, my input is a list of reviews of bought products together with an associated category. My goal is to rank the Top n terms of each category according to the chi-square value ...
Arjol Pançi
0 votes
0 answers
155 views

How does Spark read a single large splittable file? (logical block and physical block)

I have a question about the behavior when reading splittable large files in Spark. I understand that files like Parquet or ORC are splittable. I have also confirmed that when reading a large Parquet ...
matdulgi
0 votes
0 answers
20 views

Node Participation in Hadoop-Streaming with Python Scripts

I am working with Hadoop-Streaming version 3.4.0 for a data processing task (word count). I have two Python scripts: mapper.py and reducer.py. I want to ensure that all three nodes in the cluster are ...
Jihyun
0 votes
1 answer
36 views

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 16, column 46> mismatched input ',' expecting LEFT_PAREN

grunt> joined_data = JOIN filtered_features BY (store, date), sales BY (store, date); 2024-04-02 13:19:05,110 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 16, column 46> ...
Md Arif Khan
0 votes
0 answers
44 views

Hadoop No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster)

I was trying to run the WordCount MapReduce job using Eclipse: I created a Java project, exported it to a jar, then ran it with Hadoop. This is the code: import java.io.BufferedReader; import java.io....
Tứ Trần Lê
0 votes
1 answer
36 views

Spark Driver vs MapReduce Driver on YARN

I know in spark you can run the driver program on the client machine if you specify `yarn-client` deployment mode. Or you can run it on a random machine in the cluster if you specify `yarn-cluster` ...
Youssef Alaa Etman
1 vote
1 answer
39 views

Hadoop MapReduce WordPairsCount produces inconsistent results

I get very confusing results when I run MapReduce on Hadoop. Here is the code (see below). As you can see, it is a very simple MapReduce operation. The input is 1 directory with 100 .lineperdoc ...
ztsv-av
0 votes
0 answers
24 views

Hadoop MiniCluster Web UI

Does the Hadoop MiniCluster support a web UI? If yes, how do I access the web UI in a MiniCluster setup? I am facing a Forbidden error while trying to load the web UI. Using Hadoop version 2.10.2 with Maven.
dev_coders
0 votes
0 answers
36 views

Java lang runtime exception or jar file does not exist error

I am trying to run a simple PageRank lab task on my Hadoop 3.3.6 installation on an Ubuntu VirtualBox VM, but it is giving this error even though all my commands are correct and my instructor just told me to download ...
Aminago
0 votes
0 answers
18 views

Basic Python but weird problem in Hadoop Streaming: text value changes in MapReduce

I am processing a 121,983-row txt file on Hadoop, but I ran into a weird problem in the MapReduce phase. This is my mapper function: #!/usr/bin/env python import sys import re pattern = r'\b[a-zA-Z0-9]...
Reungu Ju
0 votes
1 answer
59 views

Hadoop is writing to file using context.write() but output file turns out empty

I am running Hadoop code, and having problems. Notice the commented lines "debug exception 1" and "debug exception 2" and the line below each of them. Since I can't print ...
Max
0 votes
1 answer
20 views

Apache Crunch Job On AWS EMR using Oozie

Context: I want to run an Apache Crunch job on AWS EMR. This job is part of a pipeline of Oozie Java actions and Oozie subworkflows (this particular job is part of a subworkflow). In Oozie we have a ...
Stefan Ss
2 votes
1 answer
12 views

Hadoop MapReduce WordCountLength - Type mismatch in key from map: expected org.apache.hadoop.io.Text, received org.apache.hadoop.io.IntWritable

I was trying to create a MapReduce application for WordLengthCount, as in the code below: public class WordLengthCount { public static class TokenizerMapper extends Mapper<Object, Text, ...
Kha Nguyễn Lê Hoàng
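
On the error in the title above: that message typically appears when the map output key type declared on the Job does not match what the mapper actually writes, since the map output types default to the job's (reducer) output types when not set explicitly. A hedged sketch of the declarations; IntWritable keys are an assumption based on the error text, and the exact classes depend on the asker's mapper signature.

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Job;

public class TypeDeclSketch {
    static void declareTypes(Job job) {
        // Without setMapOutputKeyClass, the map output key type falls back to
        // setOutputKeyClass, which is where "expected Text, received IntWritable"
        // style mismatches usually come from.
        job.setMapOutputKeyClass(IntWritable.class);   // what the mapper writes
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(IntWritable.class);      // what the reducer writes
        job.setOutputValueClass(IntWritable.class);
    }
}
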
0 votes
1 answer
65 views

Error: java.io.IOException: wrong value class: class org.apache.hadoop.io.Text is not class org.apache.hadoop.io.FloatWritable

I'm running a Hadoop MapReduce program to calculate the average, maximum and minimum temperature. The temperature is stored in the input1.csv file with three columns: date in YYYY-MM-DD format, temperature in ...
Ashok Kumar
0 votes
0 answers
39 views

No Output for MapReduce Program even after successful job completion on Cloudera VM

Programming Environment and Brief Overview: I am working on one of my Big Data Assignments that involved finding the Strike Rate of Gamers using Hadoop Mapreduce 2.6.0 version. I am supposed to work ...
Kaivalya
0 votes
0 answers
12 views

Context.write method returns wrong result in Mapreduce java

I'm new to Hadoop, and I encountered a weird issue. Here is the reducer code: @Override protected void reduce(Text key, Iterable<Deck> values, Context context) throws ...
AnwarIZM
0 votes
0 answers
11 views

How to get shuffle time in Hadoop 3.2.2?

I have some map/reduce code where I believe the shuffle phase in the mapper (which is performed implicitly by Hadoop) will be shorter as compared to a previously existing approach trying to solve the ...
stroopwafel95
-1 votes
1 answer
78 views

Hadoop mapreduce code failed with state FAILED due to: NA

I'm trying to run the below Hadoop mapreduce program. public static class MovieFilterMapper extends Mapper<LongWritable, Text, Text, IntWritable> { private Text movieId = new Text(); ...
Veen
0 votes
0 answers
15 views

Hadoop Map reduce in Java gives incorrect count for filtered columns

I have the following Map-Reducer in Java which reads data from a text file and filters the records based on 3 conditions. The requirement is to return the number of records in the text file which ...
Kavishka Dulshan
-1 votes
1 answer
15 views

MapReduce error:The main class could not be found or loaded

I use hadoop-3.2.2, and a Hadoop cluster has just been configured. When using MapReduce to calculate PI, an error is reported: Container exited with a non-zero exit code 1. Error file: prelaunch.err. Last ...
JiaRu Xu
0 votes
1 answer
22 views

My hadoop reducer writes the output to the context only if I write the original value to the context

I have this code: @Override protected void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException { Set<String> mySet = new ...
Kinyanjui Karanja
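
One Hadoop behaviour worth ruling out for the question above: the framework reuses the Text instance returned by the values iterator, so collecting references instead of copies into a Set can produce surprising results. A sketch of the copy-first pattern, assuming Text values as in the excerpt; the class name is illustrative.

import java.io.IOException;
import java.util.HashSet;
import java.util.Set;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class DedupReducerSketch extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        Set<String> mySet = new HashSet<>();
        for (Text value : values) {
            // The framework reuses the same Text object on each iteration,
            // so copy its contents instead of storing the reference.
            mySet.add(value.toString());
        }
        for (String s : mySet) {
            context.write(key, new Text(s));
        }
    }
}
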
0 votes
0 answers
87 views

issue running hadoop mapreduce wordcount

I am running Hadoop version 3.2.4 on Windows and want to perform a WordCount operation on the file located in hadoop/share/hadoop/mapreduce/share/mapreduce-examples-3.2.4.jar. However, it failed, and ...
ryan
0 votes
1 answer
83 views

Trouble showing output using Hadoop word count

I'm new to using Hadoop, and I want to execute Hadoop syntax using WordCount to count words. However, why is it that when I try to display the output, it doesn't appear? I would appreciate an ...
ryan
1 vote
0 answers
30 views

Hadoop mapreduce doesn't use copied file

Hadoop version: 2.10.2 JDK version: 1.8.0_291 I'm trying to start map_reduce using python. I've configured hadoop on new hduser_. After running this command in terminal: hadoop jar $HADOOP_HOME/share/...
Dorialean
0 votes
1 answer
57 views

NoClassDefFoundError: org/apache/hadoop/yarn/util/Clock

I have some errors when run WordCount command: 2023-10-06 15:55:35,005 INFO mapreduce.Job: Job job_1696606856991_0001 running in uber mode: false 2023-10-06 15:55:35,006 INFO mapreduce.Job: map 0% ...
Vũ Phan Bảo Anh
1 vote
0 answers
100 views

MapReduce Frameworks That Call Reduce Once vs. 0...N Times

The glued-together word "MapReduce" is supposed to cover a generic concept (distinct from functional programming map/reduce), originating from a conceptual paper from Google. It has an ...
ae1020
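
For comparison with the question above: in Hadoop's own MapReduce API, reduce() is invoked once per distinct key within a reduce task, receiving an iterator over all values grouped under that key. A minimal illustrative reducer:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class PerKeySumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        // Called once for this key; 'values' iterates over every value grouped under it.
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
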
0 votes
1 answer
246 views

Hadoop Error Code 127 - not able to figure out what the actual cause of the error is

I was doing an assignment for class. Running the mapper and reducer code on my local system went fine and I got the desired output. I have a feeling that there's something wrong with Hadoop. Here's the ...
Ganu Matta
2 votes
1 answer
282 views

Configure hadoop.service.shutdown.timeout property

I need to configure the value of hadoop.service.shutdown.timeout due to the shutdown hooks triggering a timeout when our MR jobs stop: 2023-08-25 08:44:39,566 [WARN] [Thread-0] [org.apache.hadoop.util....
Evelina Dumitrescu
0 votes
0 answers
62 views

MapReduce is working normally and suddenly restarts the process without aborting the Job (Sqoop)

MapReduce is working normally and then, suddenly, it restarts the process without aborting the job (Sqoop); at the end the information in the table is duplicated. Why does this happen?
Edvaldo Lucena
1 vote
0 answers
118 views

Why is my successfully run MapReduce job not showing up in the Resource Manager web interface (0.0.0.0:8088) as an entry?

Hello, I have finished my Hadoop cluster installation/configuration. I have run a couple of MapReduce tests which successfully give back results. However, when I try to keep track of them on the ...
Bruno Marques Veiga
0 votes
0 answers
26 views

I am new to Hadoop MapReduce and ran into trouble when submitting my MapReduce job from Windows to the cluster running on Linux

I am setting up a pseudo-distributed Hadoop MapReduce cluster and HBase, and I want to submit an HBase MapReduce job to the cluster. The cluster is now working well and the dependency libs are transmitted to the ...
PnEndless
0 votes
0 answers
39 views

Unable to run hadoop jar - incorrect number of arguments

I am new to Hadoop and followed a WordCount MapReduce example that I found online to run in Ubuntu. I managed to complete all but the final step - i.e. running the job. I uploaded the input file to ...
Wasim_Khan
0 votes
1 answer
202 views

Running a hadoop streaming and mapreduce job: PipeMapRed.waitOutputThreads() : subprocess failed with code 1

I am using Hadoop 3.3.4 and trying to execute a MapReduce program in Python that uses the Google PageRank algorithm to rank pages. I'm trying to run this on my own Hadoop cluster. I ran the job using the ...
Tahar Jaafer
1 vote
0 answers
29 views

Reduce Is Not Running In MapReduce Hadoop

I am trying to use MapReduce in Hadoop with Windows 11 and Python. My mapper function completes successfully, but the reducer function does not seem to be running or making progress. I got Map 100% and ...
Aurell Layalia
0 votes
1 answer
219 views

Hadoop unable to see file though it exists: "Error: java.io.FileNotFoundException: File does not exist: /user/hadoop/centroids.seq"

I wrote a hadoop program and when I try to execute it (I use this command: hadoop jar kmeans-1.0-SNAPSHOT.jar it.kurapika.Kmeans dataset.txt output) I get the following error: Error: java.io....
Sissi
