189 questions
0
votes
1
answer
42
views
Java action in Apache Oozie workflow
I am trying to configure an Apache Oozie workflow to execute different actions depending on the day of the week. After reading https://stackoverflow.com/questions/71422257/oozie-coordinator-get-day-of-...
0
votes
0
answers
16
views
Oozie workflow arguments interpret double quotes weirdly for spark-submit command parameters
I have a Spark job which takes a bunch of configurable parameters. I am facing an issue specifically in this portion:
--conf spark.executor.extraJavaOptions="-Duser.timezone=PST -XX:+UseG1GC -...
0
votes
0
answers
40
views
Spark Oozie job stuck in RUNNING state
I am running a Spark job on Oozie. This Spark job processes some data on S3 and then loads the data into the Snowflake DWH.
At the end of the code I am calling a spark stop.
import org.apache.log4j....
0
votes
0
answers
30
views
Cannot run program "script.sh" (in directory "/app/hadoop/tmp/nm-local-dir/usercache/nirmalya/appcache/application) error=2, No such file or directory
I have configured the HDFS path where my workflow.xml and script.sh files are present.
In the Hadoop cluster web UI, I can see the job status showing as SUCCEEDED.
But in the Oozie web console, this error message is ...
0
votes
0
answers
26
views
Incorrect Oozie behavior under increased load: sub-workflows hang in the "RUNNING" status
I launch an Oozie workflow with the following structure:
-- Oozie workflow
------> subworkflow_1
---------- fork_1
---------- fork_2
---------- ...
---------- fork_n
------> subworkflow_2
-------...
0
votes
0
answers
43
views
Output data exceeds its limit [20480]
I am currently using Oozie to call a shell script. Based on the exit value of the called shell script, Oozie goes to the next node, which is a decision node. In order to use the decision node I am using ...
0
votes
0
answers
49
views
Send row count of a table in Oozie success email
I have PySpark code which is scheduled to run by Oozie. I need to print the total row count of the output table created by this workflow in the success email.
I have set this row count as an ...
1
vote
1
answer
132
views
Renaming a column in a Hive table removed all of that column's values for data loaded prior to deployment
We just went ahead with a deployment for one of our Hive-based tables. We renamed the column risk_old to risk_new. The table is period partitioned. However, post deployment, we saw a strange ...
1
vote
1
answer
99
views
Apache Oozie error: org.apache.oozie.action.ActionExecutorException: Action of type spark is not supported
I am trying to launch a Spark application via Oozie (Oozie version is 5.2.0).
Spark version 3.0.0, Scala version 2.12.10, Java version 1.8.
I got this error:
2023-06-01 14:26:56,393 WARN ActionStartXCommand:523 - ...
0
votes
1
answer
103
views
Oozie action triggered by Kafka messages
The task is to implement the following workflow:
A Kafka consumer reads a message with file metadata from a topic.
Copy the file (specified in the metadata) from one filesystem (not HDFS) to another filesystem (not HDFS)...
0
votes
1
answer
41
views
Oozie coordinator app: how to configure an action triggered by an external data source?
I would like to run a job every time an external data source is updated, for example, when some government file is updated: http://www.ic.gc.ca/folder/filename.zip. Is there a way of doing it?
Please ...
0
votes
0
answers
53
views
FileNotFoundError for a JSON file in an HDFS location when running Oozie
I am trying to pass a dictionary of arguments to a script (a Spark action) in Oozie.
The dictionary is in a .json file.
I pass the .json file as an argument in workflow.xml but deploying the ...
0
votes
1
answer
109
views
Slack alerts for Oozie job status
I am looking to change the existing alerting for Oozie job statuses from email to Slack alerts.
I haven't found any resources online.
The existing method is using: <email xmlns="uri:...
0
votes
1
answer
209
views
Not able to run (schedule) the Oozie example MapReduce job: java.net.ConnectException
I am using Hadoop 2.6.0 and Oozie 5.2.0.
I am trying to run the example MapReduce job in Oozie but get the error below.
hadoop1@ip-172-31-84-37:/usr/local/oozie-5.2.0/examples/target/examples/apps/map-...
0
votes
1
answer
159
views
Use class variables inside Oozie workflow
I have an Oozie workflow that has the following format:
<workflow-app xmlns="uri:oozie:workflow:0.5" name="${componente}">
...
<start to="...
-3
votes
1
answer
738
views
Null message body [closed]
echo test=$( "Message Body " | mail -s "Subject Testing " -a $(ls -dt $PWD/*|head -1) [email protected])
0
votes
1
answer
110
views
Oozie initial instance and start time giving error on missing dataset
I am new to Oozie and trying to understand dataset.xml. I have the following dataset and am trying to understand what exactly Oozie is trying to validate here. What is the meaning of initial instance and ...
1
vote
1
answer
322
views
Oozie coordinator get day of the week
I am trying to create a condition in my Oozie workflow, where an action should be executed only on Mondays (at the end of the workflow).
So far I added a decision node in the workflow, and the current ...
0
votes
1
answer
404
views
Running a Python script in Oozie with special libraries without installing them on the server
I want to run a Python script in an Oozie workflow with special libraries. But I want to run this script without installing these special libraries on the Hadoop nodes. I tried to run it with virtualenv but ...
0
votes
1
answer
337
views
Read a Hive table in a Python script within an Oozie shell action
I have the following Python script shell_csv.sh running in an Oozie shell action:
#! /usr/bin/env python
import csv
import sys
import os
import subprocess
csv.field_size_limit(300000)
with open(r'...
1
vote
3
answers
837
views
Oozie with Spark: java.lang.IllegalArgumentException: java.net.UnknownHostException: nameservice1
I use CDH 6.3.2.
Hadoop is configured for HA.
I created a workflow with Spark in Hue.
When I run this workflow I get an error:
Failing Oozie Launcher, java.net.UnknownHostException: nameservice1
java.lang.IllegalArgumentException: ...
0
votes
1
answer
1k
views
Oozie variable cannot be resolved
I am having an issue passing a variable to a decision node. The parameter is declared under the global config:
<global>
<configuration>
<property>
<name&...
1
vote
0
answers
273
views
Set SLA for Oozie job in WAITING status
I have a coordinator which has a data dependency on a Parquet directory, partitioned by date, and it runs every day in the morning. If the file isn't available for that day, the workflow goes into &...
1
vote
1
answer
751
views
Oozie let other forked actions continue in case one fails but terminate after the join
I have a workflow that I fork into 3 actions.
<start to="PARALLEL_PROCESS_FORK"/>
<fork name="MY_FORK">
<path start="START_PARALLEL_PATH_1"/>
<path ...
1
vote
1
answer
335
views
Create table name using username in Hive query running in Oozie workflow?
I've got a Hive SQL script/action as part of an Oozie workflow. I'm doing a CREATE TABLE AS SELECT to output the results. I want to name the table using the username plus an appended string (e.g. &...
0
votes
0
answers
189
views
How do I repeatedly run a Hive query using each line of a multi line input as the parameter?
Using Hue, I've got a Hive query that will take an input (eg. an ID number) and return a record based on that. I need to handle multiple numbers to look up in one go (in serial or parallel) and ...
0
votes
2
answers
90
views
Oozie workflow throws socket error but submits the workflow twice after 10 minutes
I am facing a very weird issue. I have a workflow XML which contains about 20 fork-join nodes, each containing 4-8 actions. When I submit this workflow, it waits for 5-6 minutes, then throws
"Error: ...
0
votes
1
answer
735
views
How to mark an Oozie workflow action's status as OK
I am using Apache Oozie. I want to mark the status of one of the shell actions in my Oozie workflow as OK. It is in the RUNNING state.
Can you please share the command to use in Apache Oozie to do this?
1
vote
1
answer
767
views
Oozie Spark action workflow cannot start
I have a simple Spark job that cannot run via Oozie. The same Spark job runs via spark-submit. I submit the job workflow and get the following error:
2020-10-06 11:30:05,677 INFO [main] org.apache....
1
vote
0
answers
108
views
Loading xml file and mapping given columns before inserting data into Hive Table
I want to load the XML file into Hive columns, but before that I need to map some fields using given map values.
Example:
I have an xml file like this:
<?xml version="1.0"?&...
1
vote
2
answers
3k
views
Get Spark application id based on Oozie job id
I am trying to get the Spark application id from Unix based on the Oozie id. I am able to get the MapReduce job id when I try with oozie -info <oozie_id>@<action_name>. How can I get spark ...
1
vote
0
answers
662
views
SLF4J: Class path contains multiple SLF4J bindings error
Hi, I get a multiple SLF4J bindings error when I schedule my Spark job in Oozie. Some of these bindings are part of the cache directory. How do I resolve this? Please find the error message below.
SLF4J: ...
0
votes
1
answer
202
views
Why would Oozie fail a job with Error Code LimitExceededException when yarn reports that oozie launcher & mapreduce job have completed successfully?
There are a few questions similar to this on SO. However, nothing has worked for me, so I am posting this question.
I am using CDH 6.2.1.
I have a workflow that has a map-reduce action. The map-reduce ...
0
votes
1
answer
183
views
Can I use an oozie action as a template that I call many times?
I have an Oozie shell action that takes a number of arguments that get passed to the shell script. I want to trigger that action multiple times with different arguments each time. An example dag ...
0
votes
1
answer
166
views
How to point multiple workflows to a centralized location in Oozie
I have more than 10 Oozie workflows. Each workflow.xml, coordinator.properties and XML, plus the lib folder, is in a separate folder. All the workflows have some common jars of around 6 MB in size and I have to ...
0
votes
1
answer
245
views
Running an Oozie action on a future date
I have a requirement where a workflow is run on demand, but one task (a curl command) needs to be triggered at a future time.
0
votes
0
answers
43
views
Input split in Hadoop Streaming
I am using Hadoop Streaming via a Python script to attach prediction scores to data using a pre-trained model. My data is organized as follows:
/user/..../idType=id1/part-0000.csv
/user/..../...
0
votes
2
answers
59
views
Oozie property file value not being read from Spark
I have a property file in Oozie and get the value from a shell script like below:
filter_cond = record = 'n' and name = 'abc' and age = '14'
In the shell script:
val cond = ${getproperty filter_cond}
...
2
votes
1
answer
751
views
How to use GCS bucket as workflow file source for Oozie in Dataproc
We're migrating our EMR cluster to Dataproc, and we're relying on Oozie to run our workflows. The first challenge is how to load the workflow.xml from a Cloud Storage bucket. We used to do it using S3:
...
0
votes
1
answer
294
views
Oozie Spark action: sometimes getting a "No such method" error
I'm getting this exception while executing a Spark action via Oozie. Sometimes the job runs fine and sometimes I get this exception. Really weird, not sure why it's happening.
I have checked the versions of spark ...
1
vote
1
answer
958
views
Oozie - run a workflow every day or every hour
I have an Oozie workflow (hive_insertion.xml) that executes a .hive file, which inserts data into a table.
The Oozie workflow is:
<workflow-app xmlns = "uri:oozie:workflow:0.4" name = "simple-...
0
votes
1
answer
663
views
Does Oozie support decision node to call either fork-join or single action node?
I am trying to have a workflow where, based upon a variable, either the full fork-join runs or just a single action runs. I'm getting an error saying no fork for join to pair with. Is this supported? ...
1
vote
0
answers
120
views
How to check the Oozie and Expression Language versions?
How can I check which Oozie version and which Oozie Expression Language (EL) version I am using? This is needed to use the appropriate EL expressions in my Oozie workflow.
1
vote
0
answers
731
views
Oozie - EL_ERROR: cannot convert String to type Double
I am getting an Oozie EL_ERROR. Please find the Oozie workflow details below. Please advise.
Error Code: EL_ERROR
Error Message: An exception occured trying to convert String "/tmp/dir" to type "java.lang....
0
votes
0
answers
44
views
EL within EL in Oozie workflow
I am getting EL_ERROR with the Oozie workflow code snippet below, at lines 3 and 4. Please advise.
1. <decision name="chooseFruit">
2. <switch>
3. <case to="apple">${fs:dirSize(${...
-1
votes
1
answer
240
views
Get Oozie job information in Oozie workflow by REST
How can I find jobs whose parent id is null? I tried 3 methods but none of them worked for me.
/oozie/v1/jobs?jobtype=wf&filter=parent_id=%00 NOT WORKING
/oozie/v1/jobs?jobtype=wf&filter=...
0
votes
0
answers
277
views
How to design an Oozie coordinator when input arrives multiple times a day
I have a requirement to schedule my coordinator on arrival of input from another application. I may receive input one or multiple times a day. So, whenever I receive an input I need to trigger my treatment....
3
votes
0
answers
494
views
Oozie Job log unable to view println statements
I have a few System.out.println() statements in my Java code that I am trying to view in the job scheduled using Oozie. I'm unable to find those println() statements despite the Job status showing as ...
0
votes
1
answer
681
views
Access Oozie Configuration in Spark program
I have an environment variable saved in my .bash_profile. I am trying to access it via a Spark program using sys.env() method in Scala. When I don't have Oozie Scheduling, I am able to access the ...
0
votes
0
answers
1k
views
Oozie - java.lang.NoClassDefFoundError: Could not initialize class java.net.NetworkInterface
While running the Oozie sample examples, Oozie jobs get scheduled and their status shows as RUNNING. After some time the jobs get KILLED. While digging through the Hadoop logs I found these exceptions.
I have ...