All Questions
Tagged with amazon-emr hadoop-yarn
230 questions
0
votes
1
answer
379
views
AWS EMR YARN does not allocate all requested executors
The situation:
1x Primary node:
4-cores
8GiB memory
2x core nodes:
16-cores
64GiB memory
I tried to request 6 executors, each running 5 cores, so that my application would utilise 30 cores total. ...
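The core arithmetic in this question can be checked with a rough sketch. It assumes that each executor must fit entirely on one core node and that only cores limit placement; the excerpt does not state executor memory, so memory limits, the YARN ApplicationMaster container, and any cores YARN withholds are ignored here. The helper name `executors_that_fit` is hypothetical.

```python
def executors_that_fit(nodes: int, cores_per_node: int, cores_per_executor: int) -> int:
    """Count executors that fit when each executor must sit on a single node.

    Assumption: cores are the only limit (memory and the ApplicationMaster
    container are ignored).
    """
    return nodes * (cores_per_node // cores_per_executor)


# Values from the question: 2 core nodes with 16 cores each, 5 cores per executor.
fit = executors_that_fit(nodes=2, cores_per_node=16, cores_per_executor=5)
print(fit, fit * 5)  # executors that fit, and the cores they would use
```

On cores alone the request fits, so a shortfall in allocated executors would have to come from something this sketch leaves out, such as memory per container.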
0
votes
1
answer
78
views
Exception NoClassDefFoundError JniBasedUnixGroupsMapping while upgrading Drill version to > 1.14.0
For our Drill cluster, installed on Hadoop EMR core nodes, we use PAM-based authentication. The configuration is below.
security.user.auth: {
enabled: true,
packages += &...
0
votes
0
answers
374
views
Spark streaming job failing with exit code 134 after I point Spark event logs to S3
Environment
EMR version:- emr-5.30.0
Spark version:- 2.4.5
Hadoop version:- 2.8.5
Did the following steps for pointing the spark event logs to S3:-
STEP 1:
sudo nano /etc/spark/conf/spark-defaults....
1
vote
1
answer
319
views
Confounded logging, missing application log messages - AWS EMR, spark-submit, yarn cluster mode, log4j
I am running a spark-submit (version 3.2.1-amzn-0) job that executes Scala code (RandomForest) we've written on Amazon EMR 6.7.
spark-submit ... --class ...RandomForest \
--conf spark.driver....
0
votes
1
answer
2k
views
Role of command-runner.jar and script-runner.jar in aws emr
When we execute a Spark job in an EMR cluster, we add a step as
'HadoopJarStep': {
'Args': [
'spark-submit',
's3://spark-test-bucket-pr/spark_job/...
0
votes
1
answer
455
views
Shuffle logs filling disk in EMR task nodes
I have a Spark 3 job running on EMR 6.9, and it is a continuously running job. I am noticing a gradual increase in disk usage on the task nodes over time. I have noticed errors like this on the task nodes -
2023-...
0
votes
1
answer
278
views
Creating a 50Giga parquet file of random integers using pyspark fails
I've tried using different sizes of clusters (EMR on AWS) and it always fails due to YARN killing all the nodes:
https://aws.amazon.com/premiumsupport/knowledge-center/emr-exit-status-100-lost-node/
I ...
0
votes
1
answer
162
views
Submitting Multiple Jobs in Sequence
I'm having some trouble understanding how Spark allows for scheduling of jobs. I have a series of jobs I'd like to run in sequence. From what I've read, I can submit any number of jobs to spark-submit ...
0
votes
1
answer
220
views
YARN add new queue or clear default queue
I'm running YARN on an EMR cluster.
mapred queue -list returns:
Queue Name : default
Queue State : running
Scheduling Info : Capacity: 100.0, MaximumCapacity: 100.0, CurrentCapacity: 0.0
How do I ...
0
votes
2
answers
416
views
How to kill an EMR task programmatically
I want to programmatically kill an EMR streaming task. If I kill it from the EMR UI or the boto client, it disappears in EMR, but it is still active in the Hadoop cluster (see this article). Only if I go ...
1
vote
0
answers
354
views
AWS EMR - how to get driver and worker node IP via CLI?
I can get the master node IP with this:
aws emr describe-cluster --output text --cluster-id $(aws emr list-clusters --active --query 'Clusters[?Name==`ClusterName`].Id' --output text) --query ...
0
votes
0
answers
65
views
pyspark conf and yarn top memory discrepancies
An EMR cluster reads (from main node, after running yarn top):
YARN top - 13:27:57, up 0d, 1:34, 1 active users, queue(s): root
NodeManager(s): 6 total, 6 active, 0 unhealthy, 2 decommissioned, 0
lost,...
0
votes
1
answer
181
views
Unable to create Scheduler Pools in EMR using PySpark
I am fairly new to the concept of Spark Schedulers/Pooling and need to implement the same in one of my Projects. Just in order to understand the concept better, I scribbled the following streaming ...
0
votes
1
answer
172
views
YARN Schedulers - Fair Scheduler - Running Jobs specifying Queue
How do we assign jobs to a specific queue when we have multiple queues? I'm using YARN (Hadoop) with AWS EMR.
1
vote
0
answers
321
views
Spark on EMR | EKS |YARN
I am migrating from on-premises to the AWS stack. I have a question, and I am often confused about how Apache Spark works in AWS or similar environments.
I will just share my current understanding about the onpremise ...
2
votes
1
answer
414
views
EMR, Spark: proper place for a local shared cache
In our Spark application, we store the local application cache in /mnt/yarn/app-cache/ directory, which is shared between app containers on the same ec2 instance
/mnt/... is chosen because it is a ...
-1
votes
1
answer
579
views
Spark is inconsistent with unusually encoded CSV file
Context:
As part of a data pipeline, I am working on some flat CSV files
Those files have unusual encoding and escaping rules
My intention is to preprocess those and convert them to Parquet files for subsequent ...
0
votes
0
answers
731
views
Flink application TaskManager timeout exception Flink 1.11.2 running on EMR 6.2.1
We are currently running a Flink application on EMR 6.2.1 which runs on Yarn. The flink version is 1.11.2
We're running at a parallelism of 180 with 65 task nodes in EMR. When we start up the yarn ...
0
votes
0
answers
476
views
I am not able to run dask yarn cluster on AWS EMR
I want to run Dask on EMR using YarnCluster.
I have used the bootstrap script below, but I ran these instructions in an SSH console.
#!/bin/bash
HELP="Usage: bootstrap-dask [OPTIONS]
Example AWS EMR ...
0
votes
1
answer
1k
views
Airflow emrAddStepsOperator unable to execute spark shaded jar
What should the step type be for a Spark app? I am facing an issue where the master type is not set or YARN is not recognized. It seems to be treating the application as a simple jar rather than spark submit ...
1
vote
0
answers
76
views
install github on yarn in EMR cluster
I tried many times to install git on YARN in the EMR cluster,
but it fails with the error below. Caching for the S3 bucket was also difficult.
0
votes
1
answer
814
views
My Spark jobs are staying in accepted mode for a long time on an AWS EMR cluster
My Spark jobs are staying in accepted mode for a long time on an AWS EMR cluster. Previously they spent less time in accepted mode; now it has increased. Below are some of the configs that I am using ...
1
vote
1
answer
329
views
How to Show Dask Dashboard Link When Submitting Dask-Yarn Job Remotely?
Problem
Would anyone happen to know how to retrieve the dask dashboard link when I submit my dask-yarn job? I have a print statement for displaying the dask dashboard link, but it doesn't show up in ...
0
votes
1
answer
419
views
java.lang.ArrayIndexOutOfBoundsException: 1 while saving data frame in spark Scala
In EMR, we are using Salesforce Bulk API calls to fetch records from a Salesforce object. For the data frame of one of the objects (TASK), we get the error below while saving to Parquet.
java.lang....
1
vote
0
answers
542
views
EMR UI shows the job is still running but YARN shows the job is completed, and this happens intermittently
After a Spark job completes (the job is able to upload the files to S3 successfully), YARN shows the job as completed in the YARN UI, but EMR shows the step as still running (in the AWS EMR console) and ...
1
vote
0
answers
336
views
Hadoop MapReduce job container throws java.io.FileNotFoundException for path /tmp/crunch-1324412807/p2/MAP
I am seeing the below error when I submit an Oozie job in an EMR Hadoop cluster. I could see that a certain container is not finding the temporary output file that is produced by another job (?).
I ...
0
votes
1
answer
740
views
Wrong Yarn node label mapping with AWS EMR machine types
Does anyone have experience with YARN node labels on AWS EMR? If so, please share your thoughts. We want to run all the Spark executors on Task (Spot) machines and all the Spark ApplicationMaster/...
0
votes
0
answers
339
views
Delay in application submission from Oozie and Yarn
We are running an Oozie workflow that has a Shell action and a Spark action, that is, a shell script and a Spark job that run in sequence.
Running single workflow:
Total: 3 mins
Shell action: 50 ...
0
votes
1
answer
644
views
AWS EMR Containers not using all available cores
I have an EMR Cluster that has correctly spawned 6 executors, 4 cores each.
When the Spark job is run on the cluster, it creates 6 containers, each of which is assigned only 1 core. How do I specify the ...
1
vote
1
answer
152
views
Why does a Spark Stage with Chained withColumn window aggregations keep running OOM even with smaller partitions?
I have a stage in a Spark job that contains a long chain of window aggregations, and it keeps failing no matter how many partitions I add.
My cluster configuration is a ```48 Node(r5.2xlarge) EMR Cluster ...
3
votes
0
answers
385
views
Why do AWS EMR EC2 instances have default spark.yarn.executor.memoryOverhead set to 18.75%
Amazon EMR default memory limits for Spark executors
With the Spark on YARN configuration option which was introduced in EMR version 5.22: spark.yarn.executor.memoryOverheadFactor and defaults to 0....
0
votes
1
answer
714
views
How to add a custom node label to task node in EMR
I want to run my spark executors on task nodes only in my AWS EMR cluster and yarn labels are one of the ways to achieve this. I can specify labels during spark-submit. I want to achieve the following
...
1
vote
1
answer
5k
views
List yarn application jobs
I'm working in an AWS environment and I'm trying to get this application info via the CLI, but I didn't find any YARN command for that:
EMR Application Jobs:
Any idea?
6
votes
2
answers
948
views
Hadoop YARN: How to force a Node to be Marked "LOST" instead of "SHUTDOWN"?
I'm troubleshooting YARN application failures that happen when nodes are LOST, so I'm trying to recreate this scenario. But I'm only able to force nodes to be SHUTDOWN instead of LOST. I'm using AWS ...
1
vote
1
answer
628
views
Getting import error while executing statements via livy sessions with EMR
I am trying to post statements to a Livy session with EMR 6.1.0, but I am unable to import the class (in my custom jar) that I am trying to execute.
Statement I am trying to post to a livy session -
...
1
vote
1
answer
955
views
CANCELing a YARN step in EMR
I have a long running YARN application running on EMR cluster.
Based on Canceling EMR Steps, the running steps can be canceled with command
aws emr cancel-steps as long as Amazon EMR versions 5.28.0 ...
2
votes
3
answers
2k
views
Where to find node logs in AWS EMR cluster?
I have a PySpark program running on an AWS EMR cluster.
Cluster config is like this - emr-5.31.0, hadoop 2.10.0, hive 2.3.7, hue 4.7.1, pig 0.17.0.
Program processes some files on hdfs file system but at ...
1
vote
1
answer
281
views
Flink - unable to recover after yarn node termination
We are running flink on yarn. We were performing Disaster recovery Testing and as part of that, we manually terminated one of the nodes that had a flink application running. Once the instance was ...
0
votes
1
answer
190
views
Why does my YARN show less memory (12 GB) than the host machine (32 GB)
I am using EMR and I have task nodes with 32 GB of memory. However, when I log in to my YARN UI, it says it has only 12 GB of memory.
Yes, I understand some memory should be used by OS and other ...
1
vote
0
answers
645
views
Spark, YARN, and AWS EMR: INFO ApplicationMaster: Final app status: FAILED, exitCode: 11, (reason: Due to executor failures all available node
I have an AWS EMR cluster running Spark 2.4.4. I'm running a monthly data conversion process with Pyspark, and have never had an issue with it, but today I'm hitting a new error I've never seen before:...
0
votes
1
answer
1k
views
aws emr pyspark stuck on collect call
I am trying to learn to set up PySpark on AWS EMR. However, the sample job I am running is stuck in the collect API call. I am using EMR version 5.30.1.
I don't see any relevant logs associated with this. ...
3
votes
0
answers
921
views
Spark on Yarn Number of Cores in EMR Cluster
I have an EMR cluster for Spark with the configuration below, using 2 instances.
r4.2xlarge
8 vCore
So my total is 16 vCores, and the same is reflected in the YARN vCores.
I have submitted a spark streaming job ...
0
votes
0
answers
696
views
How do I use the portable runner and spark-submit to submit Beam's wordcount Python example to a remote Spark cluster on EMR running YARN?
I am trying to submit Beam's wordcount Python example to a remote Spark cluster on EMR running YARN as its resource manager. According to the Spark documentation this needs to be done using the ...
0
votes
1
answer
2k
views
How to control number of container in Hive-On-Tez
I'm new to the Tez engine. I'm running Hive queries on Tez, and the query seems to utilize all the available resources. I'd like to know if there is any way to control the number of running ...
0
votes
1
answer
165
views
EMR too much worker memory usage
from dask_yarn import YarnCluster
from dask.distributed import Client
# Create a cluster where each worker has two cores and eight GiB of memory
cluster = YarnCluster(environment='s3://openbank-ds-...
1
vote
2
answers
2k
views
EMR Disable node label
I'm new to EMR. By default, the EMR AMI 5.28.0 seems to label the nodes (CORE, DEFAULT), and the YARN application master seems to run under the CORE label.
How to reconfigure it so that the ...
0
votes
2
answers
909
views
Cannot ssh into Spark worker
There are 8 failed tasks in a particular executor. I want to connect to it via ssh to view the yarn logs.
The executor address is: ip-123-45-6-78.us-west-2.compute.internal:34265
I've tried both:
ssh ...
1
vote
1
answer
208
views
Sorting a big DataFrame with many columns using Spark exceeds spark.driver.maxResultSize
When running the following query:
spark.read.parquet("hdfs:///mydataframe").orderBy('a, 'b, 'timestamp).show(100, false)
my spark job fails, with the following exception:
org.apache.spark....
3
votes
1
answer
864
views
Run beam pipeline with flink yarn session on EMR
I am trying to run a basic wordcount beam pipeline from the python SDK with a flink yarn session on AWS EMR. I have used both the flink runner and portable runner and get two different errors listed ...
0
votes
0
answers
149
views
AWS EMR Hive insert statement gives error "The auxService:mapreduce_shuffle does not exist"
Below is the simple insert statement I'm running on AWS EMR version 5.28.1 with Hive 2.3.6-amzn-0 and Hadoop 2.8.5. I have tried this change in yarn-site.xml auxService:mapreduce_shuffle does not exist on ...