0 votes
1 answer
42 views

Java action in Apache Oozie workflow

I am trying to configure an Apache Oozie workflow to execute different actions depending on the day of the week. After reading https://stackoverflow.com/questions/71422257/oozie-coordinator-get-day-of-...
Lorenzo Panebianco's user avatar
0 votes
0 answers
16 views

Oozie workflow arguments interprets double quotes weirdly for spark-submit command parameters

I have a Spark job which takes a bunch of configurable parameters. I am facing an issue specifically in this portion: --conf spark.executor.extraJavaOptions="-Duser.timezone=PST -XX:+UseG1GC -...
Zaid Khan's user avatar
  • 826
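
A minimal sketch of why the double quotes in the question above can end up "interpreted weirdly". It uses a shortened, illustrative option string (not the asker's full command) and only contrasts two tokenization strategies: shell-style parsing, which strips the quotes and keeps the quoted text as one token, and plain whitespace splitting, which keeps the quote characters and breaks the value apart. Which of the two a given Oozie version applies to its arguments is not stated in the question and is left open here.

```python
import shlex

# Shortened, illustrative option string; the real command in the question is longer.
opts = '--conf spark.executor.extraJavaOptions="-Duser.timezone=PST -XX:+UseG1GC"'

# Shell-style parsing: quotes are removed, the quoted text stays one token.
shell_tokens = shlex.split(opts)

# Whitespace-only splitting: quote characters survive and the value is cut in two.
naive_tokens = opts.split()

print(shell_tokens)
print(naive_tokens)
```

If the launcher behaves like the second case, the JVM options arrive with stray quote characters, which would match the symptom described.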
0 votes
0 answers
40 views

Spark Oozie job stuck in RUNNING state

I am running a spark job on oozie. This spark job processes some data on S3 and then loads the data into snowflake DWH. At the end of the code I am calling a spark stop. import org.apache.log4j....
The Beast's user avatar
  • 133
0 votes
0 answers
30 views

Cannot run program "script.sh" (in directory "/app/hadoop/tmp/nm-local-dir/usercache/nirmalya/appcache/application) error=2, No such file or directory

I have configured the HDFS path where my workflow.xml and script.sh files are present. In the Hadoop cluster web UI, I can see the job status showing as SUCCEEDED. But in the Oozie web console, this error message is ...
Nirmalya Sarkar's user avatar
0 votes
0 answers
26 views

Incorrect behavior Oozie with increased load. Hanging subsidiaries in the status "RUNNING"

I launch an Oozie workflow with the following structure: -- Oozie workflow ------> subworkflow_1 ---------- fork_1 ---------- fork_2 ---------- ... ---------- fork_n ------> subworkflow_2 -------...
Alevtina's user avatar
  • 145
0 votes
0 answers
43 views

Output data exceeds its limit [20480]

I am currently using Oozie to call a shell script. Based on the exit value of the called shell script, Oozie goes to the next node, which is a decision node. In order to use the decision node I am using ...
Monika's user avatar
  • 55
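
One way to stay under a captured-output limit is to have the script emit only the few key=value pairs the decision node needs and to check their size before printing. The sketch below assumes the script's output is captured as key=value lines and uses the 20480 figure from the question title as the limit; the helper name `format_output` and the sample keys are hypothetical, and how Oozie measures the size internally may differ from a simple byte count.

```python
def format_output(pairs, limit=20480):
    """Render key=value lines and refuse to emit more than `limit` bytes."""
    text = "".join(f"{key}={value}\n" for key, value in pairs.items())
    size = len(text.encode("utf-8"))
    if size > limit:
        raise ValueError(f"output is {size} bytes, limit is {limit}")
    return text

# Hypothetical keys: only what the decision node needs, nothing else.
small = format_output({"status": "0", "next": "load"})
print(small, end="")
```

Everything else the script wants to report (progress messages, debug output) would go to stderr or a log file rather than stdout.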
0 votes
0 answers
49 views

Send row count of a table in Oozie success email

I have PySpark code which is scheduled to run by Oozie. I need to print the total row count of the output table created by this workflow in the success email. I have set this row count as an ...
Anusha Ravindra's user avatar
1 vote
1 answer
132 views

On renaming a column in hive table, it removed all values of that column for its previous data prior to deployment

We just went ahead with a deployment for one of our Hive based table. We renamed our column risk_old to risk_new (renamed). The table is period partitioned. However post deployment, we saw a strange ...
Toxicboy's user avatar
1 vote
1 answer
99 views

Apache Oozie error: org.apache.oozie.action.ActionExecutorException: Action of type spark is not supported

I am trying to launch a Spark application with Oozie (Oozie version is 5.2.0). Spark version 3.0.0, Scala version 2.12.10, Java version 1.8. I got this error: 2023-06-01 14:26:56,393 WARN ActionStartXCommand:523 - ...
Nikasyauskas's user avatar
0 votes
1 answer
103 views

Oozie action triggered by kafka messages

The task is to implement the following workflow: a Kafka consumer reads a message from a topic with file metadata. Copy the file (specified in the metadata) from a filesystem (not HDFS) to another filesystem (not HDFS)...
Jelly's user avatar
  • 1,296
0 votes
1 answer
41 views

Oozie coordinator app, how to configure action triggered by external data source?

I would like to run a job every time an external data source is updated, for example, when some government file is updated, http://www.ic.gc.ca/folder/filename.zip. Is there a way of doing it? Please ...
Yan's user avatar
  • 1
0 votes
0 answers
53 views

FileNotFoundError for a json file on hdfs location when running oozie

I am trying to pass a dictionary of arguments to a script (a Spark action) in an Oozie workflow. The dictionary is in a .json file. I pass the .json file as an argument in workflow.xml but deploying the ...
Roopanjali Jasrotia's user avatar
0 votes
1 answer
109 views

Slack alerts for oozie job status alerting

I am looking to change the existing system of alerting for oozie job statuses from email, to slack alerts. I haven't found resources online. The existing method is using: <email xmlns="uri:...
Franklin Rajasekar's user avatar
0 votes
1 answer
209 views

Not able to run(schedule) oozie example map reduce job || java.net.ConnectException

I am using Hadoop 2.6.0 and oozie 5.2.0 version. Trying to run example Map reduce job in oozie but getting below error. hadoop1@ip-172-31-84-37:/usr/local/oozie-5.2.0/examples/target/examples/apps/map-...
Sai 's user avatar
  • 1
0 votes
1 answer
159 views

Use class variables inside Oozie workflow

I have an oozie workflow that has the following format: <workflow-app xmlns="uri:oozie:workflow:0.5" name="${componente}"> ... <start to="...
Sadegh's user avatar
  • 412
-3 votes
1 answer
738 views

Null message body [closed]

echo test=$( "Message Body " | mail -s "Subject Testing " -a $(ls -dt $PWD/*|head -1) [email protected])
Arshiya Fathima's user avatar
0 votes
1 answer
110 views

oozie initial instance and start time giving error on missing dataset

I am new to Oozie and trying to understand dataset.xml. I have the following dataset and am trying to understand what exactly Oozie is trying to validate here. What is the meaning of initial instance and ...
Bab's user avatar
  • 181
1 vote
1 answer
322 views

Oozie coordinator get day of the week

I am trying to create a condition in my Oozie workflow, where an action should be executed only on Mondays (at the end of the workflow). So far I added a decision node in the workflow, and the current ...
ludehon's user avatar
  • 21
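
One possible approach to the day-of-week condition is to compute the weekday in a small script and expose it to the decision node as a key=value line. The sketch below assumes the timestamp arrives in the `yyyy-MM-ddTHH:mmZ` form (UTC); the function name `day_of_week_line`, the `day_of_week` key, and the sample dates are illustrative choices, not taken from the question.

```python
from datetime import datetime

# Fixed English names so the result does not depend on the system locale.
DAY_NAMES = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
             "FRIDAY", "SATURDAY", "SUNDAY"]

def day_of_week_line(nominal_time):
    """Return a 'day_of_week=NAME' line for a timestamp like 2022-03-14T06:00Z."""
    parsed = datetime.strptime(nominal_time, "%Y-%m-%dT%H:%MZ")
    # datetime.weekday() returns 0 for Monday through 6 for Sunday.
    return f"day_of_week={DAY_NAMES[parsed.weekday()]}"

print(day_of_week_line("2022-03-14T06:00Z"))  # day_of_week=MONDAY
```

A decision node could then compare the captured value against `MONDAY`, assuming the script runs in an action whose output is captured.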
0 votes
1 answer
404 views

Running Python script in Oozie with special libraries without installing them on the server

I want to run a Python script in an Oozie workflow with special libraries. But I want to run this script without installing these special libraries on the Hadoop nodes. I tried to run with virtualenv but ...
Onur Tekir's user avatar
0 votes
1 answer
337 views

read hive table in a python script within a shell oozie action

I have the following python script shell_csv.sh running in an oozie shell action: #! /usr/bin/env python import csv import sys import os import subprocess csv.field_size_limit(300000) with open(r'...
Stella's user avatar
  • 69
1 vote
3 answers
837 views

In Oozie with Spark: java.lang.IllegalArgumentException: java.net.UnknownHostException: nameservice1

I use CDH 6.3.2, and Hadoop is configured for HA. I made a workflow with Spark in Hue. When I run this workflow I get an error: Failing Oozie Launcher, java.net.UnknownHostException: nameservice1 java.lang.IllegalArgumentException: ...
ighack's user avatar
  • 31
0 votes
1 answer
1k views

Oozie variable cannot be resolved

I am having an issue passing a variable in a decision node. The parameter is declared under global config <global> <configuration> <property> <name&...
abhijit nag's user avatar
1 vote
0 answers
273 views

Set SLA for Oozie job in WAITING status

I have a Coordinator, which has a data dependency on a parquet directory, partitioned by date. And it runs every day in the morning. If the file isn't available for that day, the workflow goes into &...
endless's user avatar
  • 3,443
1 vote
1 answer
751 views

Oozie let other forked actions continue in case one fails but terminate after the join

I have a work-flow that I fork into 3 actions. <start to="PARALLEL_PROCESS_FORK"/> <fork name="MY_FORK"> <path start="START_PARALLEL_PATH_1"/> <path ...
user1485864's user avatar
1 vote
1 answer
335 views

Create table name using username in Hive query running in Oozie workflow?

I've got a Hive SQL script/action as part of an Oozie workflow. I'm doing a CREATE TABLE AS SELECT to output the results. I want to name the table using the username plus an appended string (e.g. &...
Alex Kerr's user avatar
  • 1,050
0 votes
0 answers
189 views

How do I repeatedly run a Hive query using each line of a multi line input as the parameter?

Using Hue, I've got a Hive query that will take an input (eg. an ID number) and return a record based on that. I need to handle multiple numbers to look up in one go (in serial or parallel) and ...
Alex Kerr's user avatar
  • 1,050
0 votes
2 answers
90 views

oozie workflow throws Socket error but submits the workflow twice after 10 minutes

I am facing a very weird issue. I have a workflow XML which contains about 20 fork-join nodes, each containing 4-8 actions. When I submit this workflow, it waits for about 5-6 minutes, then throws "Error: ...
Spark_user's user avatar
0 votes
1 answer
735 views

How to mark an Oozie workflow action's status as OK

I am using Apache Oozie. I want to mark the status of one of the shell actions in my Oozie workflow as OK. It is in the RUNNING state. Can someone please share the command to use in Apache Oozie to do this?
akki's user avatar
  • 91
1 vote
1 answer
767 views

Oozie Spark action workflow can not start

I have a simple Spark job that cannot run via Oozie. The same Spark job runs via spark-submit. I submit the job workflow and get the following error: 2020-10-06 11:30:05,677 INFO [main] org.apache....
OzzMaster's user avatar
1 vote
0 answers
108 views

Loading xml file and mapping given columns before inserting data into Hive Table

I want to load the XML file into hive columns, but before that I need do some mapping of some fields using a given map values. Example: I have an xml file like this: <?xml version="1.0"?&...
AmidOV's user avatar
  • 81
1 vote
2 answers
3k views

Get spark application id based on oozie job id

I am trying to get the Spark application ID from Unix based on the Oozie ID. I am able to get the MapReduce job ID when I try with oozie -info <oozie_id>@<action_name>. How can I get spark ...
techie's user avatar
  • 363
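
A minimal sketch of one building block for this: pulling YARN application IDs out of text such as action info or launcher logs. It relies on the fact that a MapReduce job ID `job_<timestamp>_<sequence>` and its YARN application ID `application_<timestamp>_<sequence>` share the same suffix. The helper name `application_ids` and the sample text are hypothetical; note that the Spark application is typically a separate YARN application from the launcher, so its ID would have to appear in the text being scanned.

```python
import re

def application_ids(text):
    """Return YARN application IDs found in text, in order, without duplicates."""
    ids = []
    pattern = r"\b(job|application)_(\d+)_(\d+)\b"
    for _prefix, stamp, seq in re.findall(pattern, text):
        # A job_ ID maps to the application_ ID with the same suffix.
        app_id = f"application_{stamp}_{seq}"
        if app_id not in ids:
            ids.append(app_id)
    return ids

# Hypothetical sample text, not real Oozie output.
sample = (
    "External ID : job_1600000000000_0012\n"
    "Submitted application application_1600000000000_0013\n"
)
print(application_ids(sample))
```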
1 vote
0 answers
662 views

SLF4J: Class path contains multiple SLF4J bindings.Error

Hi, I get a multiple SLF4J bindings error when I schedule my Spark job in Oozie. Some of these bindings are part of the cache directory. How do I resolve this? Please find the error message below. SLF4J: ...
vignesh asokan's user avatar
0 votes
1 answer
202 views

Why would Oozie fail a job with Error Code LimitExceededException when yarn reports that oozie launcher & mapreduce job have completed successfully?

There are a few questions similar to this on SO. However, nothing has worked for me, so I am posting this question. I am using CDH 6.2.1. I have a workflow that has a map-reduce action. The map-reduce ...
hba's user avatar
  • 7,750
0 votes
1 answer
183 views

Can I use an oozie action as a template that I call many times?

I have a shell oozie action that takes in a number of arguments that get passed to the shell script. I want to trigger that action multiple times with different arguments each time. An example dag ...
Jared DuPont's user avatar
0 votes
1 answer
166 views

How to point centralized location for multiple workflows in oozie

I have more than 10 Oozie workflows. Each workflow.xml, coordinator.properties and XML plus lib folder is in a separate folder. All the workflows have some common jars of around 6 MB in size and I have to ...
Kalpesh's user avatar
  • 704
0 votes
1 answer
245 views

Running Oozie Action on a future date

I have a requirement for which a workflow is run on demand. But there is a task (a curl command) that has to be triggered at a future time.
byomjan's user avatar
  • 119
0 votes
0 answers
43 views

Input split in Hadoop Streaming

I am using hadoop streaming via a python script to attach prediction scores to data using a pre-trained model. My data is organized as follows: /user/..../idType=id1/part-0000.csv /user/..../...
Neel Shah's user avatar
  • 329
0 votes
2 answers
59 views

Oozie property file value not reading from spark

I have a property file in Oozie and am getting a value from a shell script like below: filter_cond = record = 'n' and name = 'abc' and age = '14' in Shell script val cond = ${getproperty filter_cond} ...
User6006's user avatar
  • 607
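
The property value in the question above itself contains `=` characters, so a reader that splits on every `=` would truncate it. A minimal sketch of splitting on the first `=` only, using the property line quoted in the question; the helper name `parse_property` is hypothetical and this is not how Oozie or Spark parse the file.

```python
def parse_property(line):
    """Split 'key = value' on the first '=' only, so '=' inside the value survives."""
    key, sep, value = line.partition("=")
    if not sep:
        raise ValueError(f"not a key=value line: {line!r}")
    return key.strip(), value.strip()

key, value = parse_property(
    "filter_cond = record = 'n' and name = 'abc' and age = '14'"
)
print(key)
print(value)
```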
2 votes
1 answer
751 views

How to use GCS bucket as workflow file source for Oozie in Dataproc

We're migrating our EMR cluster to Dataproc, and we're relying on Oozie to run our workflows. The first challenge is how to load the workflow.xml from a Cloud Storage bucket. We used to do it using S3: ...
Bruno Moreira's user avatar
0 votes
1 answer
294 views

OOZIE Spark Action : Getting No such method error sometimes

I'm getting this exception while executing a Spark action via Oozie. Sometimes the job runs fine and sometimes I get this exception. Really weird, not sure why it's happening. I have checked versions of spark ...
Aryan087's user avatar
  • 526
1 vote
1 answer
958 views

Oozie - run a workflow every day or every hour

I have an Oozie workflow (hive_insertion.xml) that executes a .hive file, which inserts data into a table. The Oozie workflow is: <workflow-app xmlns = "uri:oozie:workflow:0.4" name = "simple-...
Naveen Reddy Marthala's user avatar
0 votes
1 answer
663 views

Does Oozie support decision node to call either fork-join or single action node?

I am trying to have a workflow where, based upon a variable, either the full fork-join runs or just a single action runs. I'm getting an error saying no fork for join to pair with. Is this supported? ...
mathfish's user avatar
  • 194
1 vote
0 answers
120 views

how to check oozie and Expression Language Version?

How can I check which Oozie version and which Oozie Expression Language (EL) version I am using? This is needed to use the appropriate EL expressions in my Oozie workflow.
Vasanth Subramanian's user avatar
1 vote
0 answers
731 views

Oozie - EL_ERROR: cannot convert String to type Double

Getting oozie EL_ERROR. Please find below the oozie workflow details. Please advise. Error Code: EL_ERROR Error Message: An exception occured trying to convert String "/tmp/dir" to type "java.lang....
Tharun Veeraamgari's user avatar
0 votes
0 answers
44 views

EL within EL in oozie workflow

Getting EL_ERROR with below oozie workflow code snippet at line # 3 and 4. Please advise. 1. <decision name="chooseFruit"> 2. <switch> 3. <case to="apple">${fs:dirSize(${...
Vasanth Subramanian's user avatar
-1 votes
1 answer
240 views

Get oozie job information in oozie workflow by REST

How can I find jobs with the parent id is null? I tried 3 methods but none of them worked for me. /oozie/v1/jobs?jobtype=wf&filter=parent_id=%00 NOT WORKING /oozie/v1/jobs?jobtype=wf&filter=...
Gianluca's user avatar
0 votes
0 answers
277 views

How to design the Oozie coordinator on arrival of input multiple times in a day

I have a requirement to schedule my coordinator on arrival of input from another application. I may receive input one or multiple times in a day. So, whenever I receive an input I need to trigger my treatment....
Sekhar's user avatar
  • 689
3 votes
0 answers
494 views

Oozie Job log unable to view println statements

I have a few System.out.println() statements in my Java code that I am trying to view in the job scheduled using Oozie. I'm unable to find those println() statements despite the Job status showing as ...
Sparker0i's user avatar
  • 1,851
0 votes
1 answer
681 views

Access Oozie Configuration in Spark program

I have an environment variable saved in my .bash_profile. I am trying to access it via a Spark program using sys.env() method in Scala. When I don't have Oozie Scheduling, I am able to access the ...
Sparker0i's user avatar
  • 1,851
0 votes
0 answers
1k views

oozie - java.lang.NoClassDefFoundError: Could not initialize class java.net.NetworkInterface

While running the Oozie sample examples, Oozie jobs are getting scheduled and their status shows as RUNNING. After some time the jobs get KILLED. While digging through the Hadoop logs I found these exceptions. I have ...
Kamal Malik's user avatar