Mindmajix

Hadoop Job Operations

Job Operations

Submitting a workflow, coordinator or bundle job:-

• Submitting bundle jobs is supported only in Oozie 3.0 or later. Likewise, all bundle operations described below require Oozie 3.0 or later.

Example:-

$ oozie job -oozie http://localhost:11000/oozie -config job.properties -submit

job: 14-20090525161321-oozie-joe

• The properties for the job must be provided in a file, either a Java properties file (.properties) or a Hadoop XML configuration file (.xml), and the file must be specified with the -config option.

• The workflow application path must be specified in the file with the oozie.wf.application.path property.

• The coordinator application path must be specified in the file with the oozie.coord.application.path property.

• The bundle application path must be specified in the file with the oozie.bundle.application.path property. The specified path must be an HDFS path.

• The job will be created but not started; it will be in PREP status.
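The properties file described above can be sketched as follows. This is only an illustration: the helper name write_job_properties, the temporary directory and the HDFS application path are assumptions, and the script prints the submit command instead of contacting an Oozie server.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal job.properties for a workflow job.
# The application path is a hypothetical example value.

write_job_properties() {
  # $1: output file, $2: HDFS path of the workflow application (assumed)
  printf '%s\n' "oozie.wf.application.path=$2" > "$1"
}

work_dir="$(mktemp -d)"
props_file="$work_dir/job.properties"
write_job_properties "$props_file" "hdfs://localhost:8020/user/joe/workflows/mapreduce"

# Print (do not run) the submit command; a submitted job stays in PREP until started.
submit_cmd="oozie job -oozie http://localhost:11000/oozie -config $props_file -submit"
echo "$submit_cmd"
```

A coordinator or bundle job would use oozie.coord.application.path or oozie.bundle.application.path in place of the workflow property.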

Starting a workflow, coordinator or bundle job:-

Example:-

$ oozie job -oozie http://localhost:11000/oozie -start 14-20090525161321-oozie-joe

• The start option starts a previously submitted workflow job, coordinator job or bundle job that is in PREP status.

• After the command is executed, the workflow, coordinator or bundle job will be in RUNNING status.

Running a workflow, coordinator or bundle job:-

Example:-

$ oozie job -oozie http://localhost:11000/oozie -config job.properties -run

job: 15-20090525161321-oozie-joe

• The run option creates and starts a workflow job, coordinator job or bundle job.

• The parameters for the job and the workflow, coordinator or bundle application path are specified in the same way as for the submit method.

Suspending a workflow, coordinator or bundle job:-

Example:-

$ oozie job -oozie http://localhost:11000/oozie -suspend 14-20090525161321-oozie-joe

• The suspend option suspends a workflow job in RUNNING status.

• After the command is executed, the workflow job will be in SUSPENDED status.

• The suspend option suspends a coordinator or bundle job in RUNNING, RUNNINGWITHERROR or PREP status.

• When a coordinator job is suspended, running coordinator actions stay in RUNNING status while their workflows are suspended.

• If the coordinator job is in RUNNING status, it transitions to SUSPENDED status. If it is in RUNNINGWITHERROR status, it transitions to SUSPENDEDWITHERROR, and if it is in PREP status, it transitions to PREPSUSPENDED.

• When a bundle job is suspended, its running coordinators are also suspended.

• If the bundle job is in RUNNING status, it transitions to SUSPENDED status. If it is in RUNNINGWITHERROR status, it transitions to SUSPENDEDWITHERROR, and if it is in PREP status, it transitions to PREPSUSPENDED.
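The coordinator and bundle transitions listed above can be written down as a small lookup. This is a sketch only; the function name status_after_suspend is an assumption, and it models the rules as stated in the text rather than querying Oozie.

```shell
#!/usr/bin/env bash
# Sketch: status a coordinator or bundle job moves to when it is suspended,
# following the transitions described in the text.

status_after_suspend() {
  # $1: current status of the coordinator or bundle job
  case "$1" in
    RUNNING)          echo "SUSPENDED" ;;
    RUNNINGWITHERROR) echo "SUSPENDEDWITHERROR" ;;
    PREP)             echo "PREPSUSPENDED" ;;
    *)                echo "cannot suspend a job in status $1" >&2; return 1 ;;
  esac
}

status_after_suspend PREP
```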

Resuming a workflow, coordinator or bundle job:-

Example:-

$ oozie job -oozie http://localhost:11000/oozie -resume 14-20090525161321-oozie-joe

• The resume option resumes a workflow job in SUSPENDED status.

• After the command is executed, the workflow job will be in RUNNING status.

Killing a workflow, coordinator or bundle job:-

Example:-

$ oozie job -oozie http://localhost:11000/oozie -kill 14-20090525161321-oozie-joe

• The kill option kills a workflow job in PREP, SUSPENDED or RUNNING status, and a coordinator or bundle job in PREP, RUNNING, PREPSUSPENDED, SUSPENDED or PAUSED status.

• After the command is executed, the job will be in KILLED status.
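As a rough illustration of the status rules above, a script could check whether a kill is expected to apply before issuing it. The function name can_kill is hypothetical, and the status lists simply mirror the ones given in the text.

```shell
#!/usr/bin/env bash
# Sketch: decide whether the kill option applies, based on the statuses
# listed in the text for each job type.

can_kill() {
  # $1: job type (workflow, coordinator or bundle), $2: current status
  case "$1" in
    workflow)
      case "$2" in
        PREP|SUSPENDED|RUNNING) return 0 ;;
      esac ;;
    coordinator|bundle)
      case "$2" in
        PREP|RUNNING|PREPSUSPENDED|SUSPENDED|PAUSED) return 0 ;;
      esac ;;
  esac
  return 1
}

if can_kill workflow SUSPENDED; then
  # Printed rather than executed, so no Oozie server is needed.
  echo "oozie job -oozie http://localhost:11000/oozie -kill 14-20090525161321-oozie-joe"
fi
```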

Changing end time/concurrency/pause time of a coordinator job:-

Example:-

$ oozie job -oozie http://localhost:11000/oozie -change 14-20090525161321-oozie-joe -value endtime=2011-12-01T05:00Z

• The change option changes a coordinator job that is not in KILLED status.

• Valid value names are endtime, concurrency and pausetime.

• Repeated value names are not allowed.

• The new end time must not be before the job's start time or its last action time.

• The new concurrency value has to be a valid integer.

• After the command is executed, the job's end time, concurrency or pause time should be changed.

• If the end time of an already SUCCEEDED job is changed, its status becomes RUNNING.
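The rules for the change value can be checked before the command is sent. The sketch below is an assumption-laden illustration: the function name validate_change_value is hypothetical, it assumes several NAME=VALUE pairs are separated by ';', and it accepts only non-negative integers for concurrency. It does not check the end time against the job's start time, which would require the job's details.

```shell
#!/usr/bin/env bash
# Sketch: validate a -value string for a coordinator change.
# Assumes pairs are separated by ';' (an assumption, not shown in the text).

validate_change_value() {
  # $1: value string such as "endtime=2011-12-01T05:00Z;concurrency=3"
  local seen="" pair name value
  local IFS=';'
  for pair in $1; do
    name="${pair%%=*}"
    value="${pair#*=}"
    # Only the three documented value names are accepted.
    case "$name" in
      endtime|concurrency|pausetime) ;;
      *) echo "invalid value name: $name" >&2; return 1 ;;
    esac
    # Repeated value names are not allowed.
    case " $seen " in
      *" $name "*) echo "repeated value name: $name" >&2; return 1 ;;
    esac
    seen="$seen $name"
    # Concurrency has to be an integer (digits only in this sketch).
    if [ "$name" = "concurrency" ]; then
      case "$value" in
        ''|*[!0-9]*) echo "concurrency must be an integer" >&2; return 1 ;;
      esac
    fi
  done
}

validate_change_value "endtime=2011-12-01T05:00Z" && echo "value accepted"
```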

Changing pause time of a bundle job:-

Example:-

$ oozie job -oozie http://localhost:11000/oozie -change 14-20090525161321-oozie-joe -value pausetime=2011-12-01T05:00Z

• The change option changes a bundle job that is not in KILLED status.

• Valid value names are:

  • pausetime: the pause time of the bundle job

• Repeated value names are not allowed.

• After the command is executed, the job's pause time should be changed.

Rerunning a workflow job:-

Example:-

$ oozie job -oozie http://localhost:11000/oozie -config job.properties -rerun 14-20090525161321-oozie-joe

• The rerun option reruns a completed (SUCCEEDED, FAILED or KILLED) job, skipping the specified nodes.

• After the command is executed, the job will be in RUNNING status.

Rerunning a coordinator action or multiple actions:-

Example:-

$ oozie job -rerun <coord-job-id> [-nocleanup] [-refresh]
[-action 1,3-4,7-40] (-action or -date is required to rerun)
[-date 2009-01-01T01:00Z::2009-05-31T23:59Z,2009-11-10T01:00Z,2009-12-31T22:00Z]
(if neither -action nor -date is given, an exception is thrown)

• The rerun option reruns a terminated (TIMEDOUT, SUCCEEDED, KILLED, FAILED) coordinator action when the coordinator job is not in FAILED or KILLED state.

• After the command is executed, the rerun coordinator action will be in WAITING status.
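The -action argument mixes single action numbers and ranges. To make the notation concrete, the sketch below expands such a list into individual action numbers. The function name expand_actions is hypothetical, and treating each range as inclusive on both ends is an assumption about how the list is read.

```shell
#!/usr/bin/env bash
# Sketch: expand an action list such as "1,3-4,7-40" into one number per line.
# Ranges are treated as inclusive (an assumption).

expand_actions() {
  # $1: comma-separated list of action numbers and ranges
  local item start end n
  local IFS=','
  for item in $1; do
    case "$item" in
      *-*) start="${item%-*}"; end="${item#*-}" ;;
      *)   start="$item"; end="$item" ;;
    esac
    n="$start"
    while [ "$n" -le "$end" ]; do
      echo "$n"
      n=$((n + 1))
    done
  done
}

expand_actions "1,3-4"
```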

Rerunning a bundle job:-

Example:-

$ oozie job -rerun <bundle-job-id> [-nocleanup] [-refresh]
[-coordinator c1,c3,c4] (-coordinator or -date is required to rerun)
[-date 2009-01-01T01:00Z::2009-05-31T23:59Z,2009-11-10T01:00Z,2009-12-31T22:00Z]
(if neither -coordinator nor -date is given, an exception is thrown)

• The rerun option reruns the coordinator actions belonging to the specified coordinators within the specified date range.

• After the command is executed, the rerun coordinator action will be in WAITING status.

Checking the status of a workflow, coordinator or bundle job or a coordinator action:-

Example:-

$ oozie job -oozie http://localhost:11000/oozie -info 14-20090525161321-oozie-joe

Workflow Name :  map-reduce-wf
App Path      :  hdfs://localhost:8020/user/joe/workflows/mapreduce
Status        :  SUCCEEDED
Run           :  0
User          :  joe
Group         :  users
Created       :  2009-05-26 05:01 +0000
Started       :  2009-05-26 05:01 +0000
Ended         :  2009-05-26 05:01 +0000

Actions

Action Name  Type        Status  Transition  External Id           External Status  Error Code  Start                   End
hadoop1      map-reduce  OK      end         Job-20090428135-0524  SUCCEEDED        -           2009-05-26 05:01 +0000  2009-05-26 05:01 +0000

 

• The info option displays information about a workflow job, coordinator job or coordinator action.

• The info command may time out if the number of coordinator actions is very high.

• In that case, info should be used with the offset and len options.

• The offset and len options specify the starting offset and the number of actions to display when checking a workflow job or coordinator job.
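Paging through a long action list with the offset and len options could look like the sketch below. The function name info_page_commands and the idea of knowing the total action count in advance are assumptions, and offsets are assumed to start at 1. The commands are printed rather than executed.

```shell
#!/usr/bin/env bash
# Sketch: print one -info command per page of actions.
# Assumes offsets start at 1 and that the total action count is known.

info_page_commands() {
  # $1: job id, $2: total number of actions, $3: actions per page
  local job_id="$1" total="$2" len="$3" offset=1
  while [ "$offset" -le "$total" ]; do
    echo "oozie job -oozie http://localhost:11000/oozie -info $job_id -offset $offset -len $len"
    offset=$((offset + len))
  done
}

info_page_commands 14-20090525161321-oozie-joe 250 100
```

Each printed command covers one window of actions, so no single request has to return the whole list.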

Checking the server logs of a workflow, coordinator or bundle job:-

Example:-

$ oozie job -oozie http://localhost:11000/oozie -log 14-20090525161321-oozie-joe

Checking the server logs of particular actions of a coordinator job:-

Example:-

$ oozie job -log <coord-job-id> [-action 1,3-4,7-40] (-action is optional)

Checking the status of multiple workflow jobs:-

Example:-

$ oozie jobs -oozie http://localhost:11000/oozie -localtime -len 2 -filter status=RUNNING

• A filter can be specified after all options.

• The filter option syntax is: [NAME=VALUE][;NAME=VALUE]*

• Valid filter names are:

  • name: the workflow application name

  • user: the user that submitted the job

  • group: the group for the job

  • status: the status of the job

  • frequency: the frequency of the coordinator job

  • unit: the time unit; valid values are months, days, hours or minutes
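The filter syntax above can be assembled from individual NAME=VALUE pairs. The sketch below is illustrative only: the function name build_filter is hypothetical, and it checks filter names against the list given in the text before joining the pairs with ';'.

```shell
#!/usr/bin/env bash
# Sketch: build a -filter value from NAME=VALUE pairs, rejecting unknown names.

build_filter() {
  # args: one NAME=VALUE pair per argument; prints the pairs joined with ';'
  local pair name out=""
  for pair in "$@"; do
    case "$pair" in
      *=*) ;;
      *) echo "expected NAME=VALUE, got: $pair" >&2; return 1 ;;
    esac
    name="${pair%%=*}"
    case "$name" in
      name|user|group|status|frequency|unit) ;;
      *) echo "invalid filter name: $name" >&2; return 1 ;;
    esac
    out="${out:+$out;}$pair"
  done
  echo "$out"
}

filter="$(build_filter user=joe status=RUNNING)"
# The filter is quoted because ';' is special to the shell.
echo "oozie jobs -oozie http://localhost:11000/oozie -filter '$filter'"
```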

Checking the status of multiple coordinator jobs:-

Example:-

$ oozie jobs -oozie http://localhost:11000/oozie -jobtype coordinator

Job ID   App Name   Status      Freq   Unit     Started   Next Materialized

                    SUCCEEDED   1440   MINUTE

Checking the status of multiple bundle jobs:-

Example:-

$ oozie jobs -oozie http://localhost:11000/oozie -jobtype bundle
Job ID                     Bundle Name   Status    Kickoff            Created   User   Group
0000027-110 oozie-chao-B   BUNDLE-TEST   RUNNING   2012-01-15 00:24   2011-03   joe    users

 

Admin Operations:-

Checking the status of the Oozie system

Example:-

$ oozie admin -oozie http://localhost:11000/oozie -status

Safemode: OFF

• It returns the current status of the Oozie system.

Checking the status of the Oozie system (in Oozie 2.0 or later)

Example:-

$ oozie admin -oozie http://localhost:11000/oozie -systemmode [NORMAL|NOWEBSERVICE|SAFEMODE]

Safemode: ON

• It returns the current status of the Oozie system.

Displaying the build version of the Oozie system

Example:-

$ oozie admin -oozie http://localhost:11000/oozie -version
Oozie server build version: 2.0.2.1-0.20.1.3092118008--

• It returns the Oozie server build version.

Validate Operations

Example:-

$ oozie validate myApp/workflow.xml

• It performs an XML schema validation on the specified workflow XML file.

Pig Operations

Submitting a Pig job through HTTP:-

Example:-

$ oozie pig -oozie http://localhost:11000/oozie -file pigScriptFile -config job.properties -X -param_file params
job: 14-20090525161321-oozie-joe-W
$ cat job.properties
fs.default.name=hdfs://localhost:8020
mapreduce.jobtracker.kerberos.principal=ccc
dfs.namenode.kerberos.principal=ddd
oozie.libpath=hdfs://localhost:8020/user/oozie/pig/lib/

• The parameters for the job must be provided in a Java properties file (.properties).

• The job tracker, name node and lib path must be specified in this file.

• The Pig script file is a local file.

• All jar files, including the Pig jar, and all other files needed by the Pig job must be uploaded to HDFS under the lib path beforehand.

• The workflow.xml is created internally on the Oozie server.

• The job is created and run right away.
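Before submitting, it can help to confirm that the properties file actually contains the expected keys. The sketch below is a hedged illustration: the function name missing_keys is hypothetical, the sample values are copied from the example above, and only the two keys visible in that sample (fs.default.name and oozie.libpath) are checked, since the article does not show the job tracker key.

```shell
#!/usr/bin/env bash
# Sketch: report which required keys are absent from a properties file.

missing_keys() {
  # $1: properties file, remaining args: keys that must be present
  local file="$1" key pattern status=0
  shift
  for key in "$@"; do
    # Escape dots so they match literally in the regular expression.
    pattern="$(printf '%s' "$key" | sed 's/[.]/\\./g')"
    if ! grep -q "^[[:space:]]*${pattern}[[:space:]]*=" "$file"; then
      echo "$key"
      status=1
    fi
  done
  return "$status"
}

work_dir="$(mktemp -d)"
props_file="$work_dir/job.properties"
printf '%s\n' \
  "fs.default.name=hdfs://localhost:8020" \
  "oozie.libpath=hdfs://localhost:8020/user/oozie/pig/lib/" > "$props_file"

if missing_keys "$props_file" fs.default.name oozie.libpath; then
  echo "all required keys present"
fi
```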

Map-reduce Operations:-

Submitting a map-reduce job

Example:-

$ oozie mapreduce -oozie http://localhost:11000/oozie -config job.properties

• The parameters must be in a Java properties file (.properties), and this file must be specified for a map-reduce job.

• The properties file must specify mapred.mapper.class, among other mapred properties.

Rerun:-

• Reloads the config.

• Creates a new workflow instance with the same ID.

• Deletes the actions that are not skipped from the DB and copies data from the old workflow instance to the new one for skipped actions.

• The action handler skips the nodes given in the config with the same exit transition as before.

Workflow Rerun:-

• Config

• Pre-conditions

• Rerun

Config:-

• oozie.wf.application.path

• Exactly one of the following two configurations is mandatory; the two must not be defined at the same time.

  • oozie.wf.rerun.skip.nodes
  • oozie.wf.rerun.failnodes

• Skip nodes are a comma-separated list of action names. They can be any action nodes, including decision nodes.

• The valid value of oozie.wf.rerun.failnodes is either true or false.

• If a secured Hadoop version is used, the following two properties need to be specified as well:

  • dfs.namenode.kerberos.principal
  • mapreduce.jobtracker.kerberos.principal
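The "exactly one of the two" rule for the rerun configuration can be checked on a properties file before the rerun is requested. This is a sketch under assumptions: the function name check_rerun_config is hypothetical, keys are expected at the start of a line in key=value form, and the sample application path is an example value.

```shell
#!/usr/bin/env bash
# Sketch: verify that a rerun properties file defines exactly one of
# oozie.wf.rerun.skip.nodes and oozie.wf.rerun.failnodes.

check_rerun_config() {
  # $1: properties file
  local file="$1" has_skip=0 has_fail=0 fail
  if grep -q '^oozie\.wf\.rerun\.skip\.nodes=' "$file"; then has_skip=1; fi
  if grep -q '^oozie\.wf\.rerun\.failnodes=' "$file"; then has_fail=1; fi
  if [ $((has_skip + has_fail)) -ne 1 ]; then
    echo "define exactly one of oozie.wf.rerun.skip.nodes and oozie.wf.rerun.failnodes" >&2
    return 1
  fi
  if [ "$has_fail" -eq 1 ]; then
    fail="$(sed -n 's/^oozie\.wf\.rerun\.failnodes=//p' "$file")"
    case "$fail" in
      true|false) ;;
      *) echo "oozie.wf.rerun.failnodes must be true or false" >&2; return 1 ;;
    esac
  fi
  return 0
}

work_dir="$(mktemp -d)"
rerun_props="$work_dir/job.properties"
printf '%s\n' \
  "oozie.wf.application.path=hdfs://localhost:8020/user/joe/workflows/mapreduce" \
  "oozie.wf.rerun.failnodes=true" > "$rerun_props"

check_rerun_config "$rerun_props" && echo "rerun config looks consistent"
```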

Pre-conditions:-

• A workflow with ID WFID should exist.

• The workflow with ID WFID should be in SUCCEEDED, KILLED or FAILED status.

• If specified, the nodes in the config oozie.wf.rerun.skip.nodes must have completed successfully.
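These pre-conditions can be expressed as a single check. The sketch below is illustrative: the function name can_rerun and its three-argument shape are assumptions, and the caller is expected to supply the workflow status and the names of successfully completed actions (for example, gathered from the info output).

```shell
#!/usr/bin/env bash
# Sketch: check the rerun pre-conditions for a workflow job.

can_rerun() {
  # $1: workflow status
  # $2: comma-separated skip nodes (may be empty)
  # $3: space-separated names of actions that completed successfully
  case "$1" in
    SUCCEEDED|KILLED|FAILED) ;;
    *) echo "workflow must be SUCCEEDED, KILLED or FAILED, not $1" >&2; return 1 ;;
  esac
  local node
  local IFS=','
  for node in $2; do
    case " $3 " in
      *" $node "*) ;;
      *) echo "skip node did not complete successfully: $node" >&2; return 1 ;;
    esac
  done
}

can_rerun KILLED "hadoop1" "hadoop1 hadoop2" && echo "rerun pre-conditions met"
```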

