1. Introduction to AutoSys
AutoSys is an automated job control system for scheduling, monitoring, and reporting. An AutoSys job is any single command, executable, script, or UNIX batch file: a UNIX script, a Java program, or any other program that can be invoked from a shell. Each AutoSys job definition contains a variety of qualifying attributes, including the conditions specifying when and where the job should be run.
1.1 AutoSys System components
· Event server (AutoSys database)
· Event processor
· Remote agent
The Event Server is the AutoSys database, which stores all system information and events as well as all job, monitor, and report definitions. This database is sometimes also called a data server, a term that actually describes a server instance: a UNIX or Windows process and its associated data space (or raw disk storage), which can include multiple databases or table spaces.
The Event Processor is the main component of the AutoSys system; it processes all the events it reads from the data server. It is the program, running either as a UNIX process or as a Windows service, that actually runs AutoSys: it schedules and starts jobs. Once started, the event processor continually scans the database for events to be processed. When it finds one, it checks whether the event satisfies the starting conditions for any job in the database.
On a UNIX machine, the Remote Agent is a temporary process started by the event processor to perform a specific task on a remote (client) machine. On a Windows machine, the remote agent is a Windows service running on the remote (client) machine that is directed by the event processor to perform specific tasks.
The remote agent starts the command specified for a given job, sends running and completion information about a task to the event server, then exits. If the remote agent is unable to transfer the information, it waits and tries again until it can successfully communicate with the database.
1.1.4 Basic functionality of AutoSys
The diagram below illustrates the basic functionality; each numbered step is explained after it.
Explanation for the Diagram
1. The event processor scans the event server for the next event to process. If no event is ready, the event processor scans again in five seconds.
2. The event processor reads from the event server that an event is ready. If the event is a STARTJOB event, the job definition and attributes are retrieved from the Event Server, including the command and the pointer (full path name on the client machine) to the profile file to be used for the job. In addition, for jobs running on Windows machines, the event processor retrieves from the database the user IDs and passwords required to run the job on the client machine.
3. The event processor processes the event. If the event is a STARTJOB, the event processor attempts to establish a connection with the remote agent on the client machine, and passes the job attributes to the client machine.
The event processor sends a CHANGE_STATUS event marking in the event server that the job is in STARTING state.
4. On a UNIX machine, the inetd invokes the remote agent. On a Windows machine, the remote agent logs onto the machine as the user defined as the job’s owner, using the user IDs and passwords passed to it from the event processor.
5. The remote agent sends an acknowledgment back to the event processor indicating that it has received the job parameters. The socket connection is terminated. At this point, the event processor resumes scanning the event server database, looking for events to process.
6. The remote agent starts a process and executes the command in the job definition.
7. The remote agent issues a CHANGE_STATUS event marking in the event server that the job is in RUNNING state.
8. The client job process runs to completion, then returns an exit code to the remote agent and quits.
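The steps above can be sketched as a small simulation. This is a toy model in Python, not AutoSys code: the Event class, the in-memory queue standing in for the event server, and the run_command callback standing in for the remote agent are all hypothetical names introduced for illustration. Mapping exit code 0 to SUCCESS and anything else to FAILURE is also an assumption of the sketch.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    kind: str          # "STARTJOB" or "CHANGE_STATUS"
    job: str
    status: str = ""   # only used by CHANGE_STATUS events


def process_events(event_server, job_commands, run_command):
    """Drain the queue and return each job's status history."""
    history = {}
    while event_server:
        event = event_server.popleft()  # steps 1-2: read the next ready event
        if event.kind == "STARTJOB":
            # step 3: the event processor marks the job as STARTING
            event_server.append(Event("CHANGE_STATUS", event.job, "STARTING"))
            # steps 6-7: the remote agent starts the command and reports RUNNING
            event_server.append(Event("CHANGE_STATUS", event.job, "RUNNING"))
            # step 8: the command finishes and returns an exit code
            exit_code = run_command(job_commands[event.job])
            final = "SUCCESS" if exit_code == 0 else "FAILURE"  # assumption
            event_server.append(Event("CHANGE_STATUS", event.job, final))
        elif event.kind == "CHANGE_STATUS":
            history.setdefault(event.job, []).append(event.status)
    return history


# Hypothetical jobs and a fake agent that "runs" them without a real machine.
queue = deque([Event("STARTJOB", "job_a"), Event("STARTJOB", "job_b")])
commands = {"job_a": "true", "job_b": "false"}
fake_agent = lambda command: 0 if command == "true" else 1
result = process_events(queue, commands, fake_agent)
print(result)
```

The real event processor waits five seconds and scans again when no event is ready; the sketch simply stops when the queue is empty.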
There are three methods you can use to create job definitions:
- By Autosys Web interface
- By AutoSys Graphical User Interface (GUI).
- By using AutoSys Job Information Language (JIL) through a command-line interface.
Job definitions are created by using the GUI Control Panel. The fields in the GUIs correspond to the AutoSys JIL sub-commands and attributes. In addition, from the GUI Control Panel, you can open applications that allow you to define calendars, monitors, and reports, and that allow you to monitor and manage AutoSys jobs.
JIL (Job Information Language) is a scripting language that provides a way to specify how AutoSys jobs should behave. This information is saved in the AutoSys database. You can also create a JIL file containing job definitions and then pass that file to AutoSys. JIL scripts contain one or more JIL sub-commands and one or more attribute statements.
2.2.1 Essential attributes for creating a Job Definition
When using JIL to create a job definition, you enter the jil command to display the jil prompt. Every job definition needs these essential attributes:
- Job Name
- Job Type
- Owner
2.2.2 Defining Job Name Standards
- The job name is used to identify the job to AutoSys.
- The name of a box or a job is limited to 30 characters in length and is terminated with white space. Embedded blanks and tabs are illegal.
- The first two characters must be the airline code. The next three characters identify the project code or module that the job references. The next three characters identify the frequency of the job (day, wky, cal, orq). The next character identifies the priority of the job (1, 2, or 9). The remaining characters can be used to give the job a descriptive name.
- Jobs of different types (command, box, file watcher) cannot use the same name.
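The naming standard above can be checked mechanically. The following is a minimal Python sketch, assuming the fields are packed with no separators and that the frequency codes are lowercase; the function name job_name_problems and the sample names are hypothetical.

```python
import re

FREQUENCIES = {"day", "wky", "cal", "orq"}
PRIORITIES = {"1", "2", "9"}
MAX_LENGTH = 30


def job_name_problems(name):
    """Return a list of rule violations; an empty list means the name passes."""
    problems = []
    if len(name) > MAX_LENGTH:
        problems.append("longer than 30 characters")
    if re.search(r"\s", name):
        problems.append("contains whitespace")
    if len(name) < 9:
        # 2 (airline) + 3 (project) + 3 (frequency) + 1 (priority) = 9
        problems.append("too short for the fixed fields")
        return problems
    if name[5:8] not in FREQUENCIES:
        problems.append("unknown frequency code")
    if name[8] not in PRIORITIES:
        problems.append("unknown priority")
    return problems


print(job_name_problems("qfabcday1load_fares"))
print(job_name_problems("qfabcmon1load"))
```

The airline and project codes are not validated here because the text does not list the permitted values.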
2.2.3 JIL Sub Commands
When writing a JIL script, you must follow the syntax rules. JIL sub-commands are used to create, modify, override, or delete a job definition. These sub-commands are listed below:
· insert_job : Add a new job to AutoSys.
· update_job : Edit fields on an existing job.
· delete_job : Delete an existing job from the AutoSys database.
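· box_name : Place the job being defined inside the named box. Strictly speaking this is a job attribute rather than a sub-command; the box itself is added with insert_job and job_type: b.
· delete_box : Delete an existing Box job, and also delete all the jobs which are contained in the box.
· override_job : Apply overrides to the indicated job attributes, changing the behavior of the job for its next run only.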
2.2.4 Job Types
There are three types of jobs:
- Command Job: job_type keyword c
- Box Job: job_type keyword b
- File Watcher Job: job_type keyword f
As their names imply, Command Jobs execute commands, Box Jobs are containers that hold other jobs (including other boxes), and File Watcher Jobs watch for the arrival of a specified file.
2.2.4 Job Owner
The job owner specifies whose user ID the command will be run under on the client machine.
Attribute: < owner : cs6_dev@syd0432 >
2.2.5 Submitting Job Definitions
A completed JIL script is called a job definition. This job definition must be submitted to the AutoSys database before the job it defines can be run. You can submit a job to the AutoSys database using one of the following methods:
2.2.6 Running JIL
- After a job definition has been submitted to the AutoSys database, it will be started according to the starting parameters specified in its JIL script.
- The Event Processor continually polls the database; once the starting parameters have been met, it runs the job.
- If a JIL script does not specify any starting parameters, the job will not be started automatically by the Event Processor. It will start only when a sendevent command is issued for it:
sendevent -E FORCE_STARTJOB -J <job_name>
(or)
fsj <job_name>
3 Basic Job Attribute
The primary differences between the job types are the actions that are taken when the job is run.
A File Watcher Job is similar to a Command Job, but instead of running a command it monitors a specified file. When that file reaches a certain minimum size and is no longer growing, the File Watcher Job completes successfully, indicating that the file has arrived.
The basic File Watcher Job definition uses the following attributes:
- job_name: The name of the job; job_type must be f.
- watch_file: The name of the file to watch; it must include the full path to the file. Example:
  watch_file: ${inb}/commcdata/<file_name>_${stream_name}.DLY
- machine: The name of the machine on which the job is to be run.
- conditions: The date/time and/or job status conditions necessary for the job to be run.
- watch_file_min_size: Determines when enough data has been written to the file to consider it “complete.” This attribute is specified in bytes.
- watch_interval: Specifies (in seconds) how often the File Watcher should check the current file size. The default is every 60 seconds.
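The interaction between watch_file_min_size and watch_interval can be illustrated with a short Python sketch. It assumes that "no longer growing" means the size is unchanged between two consecutive checks; the function name file_arrived and the sample sizes are hypothetical.

```python
def file_arrived(observed_sizes, min_size):
    """Return the index of the check at which the file counts as arrived.

    observed_sizes holds the file size in bytes seen at each watch_interval
    check. Returns None if the file never satisfies both conditions.
    """
    for i in range(1, len(observed_sizes)):
        current, previous = observed_sizes[i], observed_sizes[i - 1]
        # Big enough, and the same size as at the previous check.
        if current >= min_size and current == previous:
            return i
    return None


sizes = [0, 400, 1200, 2048, 2048]  # one reading per watch_interval
print(file_arrived(sizes, min_size=1000))
```

With these readings the file passes the minimum size at the third check but is still growing, so it only counts as arrived at the fifth check (index 4).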
The Box Job is a container of other jobs. A Box Job can be used to organize and control process flow. The box itself performs no actions, although it can trigger other jobs to run. An important feature of this type of job is that boxes can be put inside of other boxes. When this is done, jobs related by like starting conditions (not by similar application types) can be grouped and operated on in a logical way.
3.2.1 Default Box Job Behavior
Some important rules to remember about boxes are:
- A box must have starting conditions (either date/time conditions or job dependency conditions).
- Jobs run only once per box execution.
- Jobs in a box will start only if the box itself is running.
- As long as any job in a box is running, the box remains in RUNNING state; the box cannot complete until all jobs have run.
- By default, a box will return a status of SUCCESS only when all the jobs in the box have run and the status of all the jobs is "success."
- By default, a box will return a status of FAILURE only when all jobs in the box have run and the status of one or more of the jobs is "failure."
- Changing the state of a box to INACTIVE (via the sendevent command) changes the state of all the jobs in the box to INACTIVE.
sendevent -E CHANGE_STATUS -s INACTIVE -J <box_name>
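The default SUCCESS and FAILURE rules above amount to a simple reduction over the statuses of the jobs in the box. A minimal Python sketch follows; the function name default_box_status is hypothetical, and treating a TERMINATED job like a failed one is an assumption of the sketch.

```python
FINISHED = {"SUCCESS", "FAILURE", "TERMINATED"}


def default_box_status(job_statuses):
    """Derive a box status from the statuses of the jobs it contains."""
    if any(status not in FINISHED for status in job_statuses):
        return "RUNNING"  # at least one job has not finished yet
    if all(status == "SUCCESS" for status in job_statuses):
        return "SUCCESS"  # every job ran and succeeded
    return "FAILURE"      # every job ran, at least one did not succeed


print(default_box_status(["SUCCESS", "ACTIVATED"]))
print(default_box_status(["SUCCESS", "SUCCESS"]))
print(default_box_status(["SUCCESS", "FAILURE"]))
```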
3.2.2 When you Should Not Use a Box
Avoid the temptation to put jobs in a box as a short cut for performing events (such as ON_ICE or ON_HOLD) on a large number of jobs at once.
3.3.3 What Happens when a Box Runs
- As soon as a box starts running, all the jobs in the box (including sub-boxes) change to status ACTIVATED, meaning they are eligible to run.
- Then each job is analyzed for additional starting conditions. All jobs with no additional starting conditions are started, without any implied ordering or prioritizing.
- Jobs with additional starting conditions remain in the ACTIVATED state until those additional dependencies have been met. The box remains in the RUNNING state as long as there are activated or running jobs in the box.
- If a box is terminated before a job in it was able to start, the status of that job will change directly from ACTIVATED to INACTIVE.
3.3.4 Time Conditions in a Box
Each job in a box will run only once per box execution. Therefore, you should not define more than one time attribute for any job in a box because the job will only run the first time.
- If you want to put a job in a box, but you also want it to run more than once, you must assign multiple start time conditions to the box itself.
- Remember also that the box must be running before the job can start. Do not assign a start time for a job in a box if the box will not be running at that time. If you do, the next time the box starts the job will start immediately.
3.3.5 How Job Status Changes Affect Box Status
If a box contained only one job, and the job changed status, the box status would change as shown in the following table.
How Job Status Changes Impact Box Status
Current BOX Status | New JOB Status        | New BOX Status
SUCCESS            | TERMINATED or FAILURE | FAILURE
SUCCESS            | SUCCESS               | NO CHANGE
FAILURE            | SUCCESS               | SUCCESS
FAILURE            | FAILURE               | NO CHANGE
INACTIVE           | SUCCESS               | SUCCESS
INACTIVE           | TERMINATED or FAILURE | FAILURE
TERMINATED         | ANY CHANGE            | NO CHANGE
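The table above can be encoded as a lookup for the single-job case. This is a minimal Python sketch: the names BOX_TRANSITIONS and new_box_status are hypothetical, and returning the current status for combinations the table does not list is an assumption.

```python
# (current box status, new job status) -> new box status
BOX_TRANSITIONS = {
    ("SUCCESS", "TERMINATED"): "FAILURE",
    ("SUCCESS", "FAILURE"): "FAILURE",
    ("FAILURE", "SUCCESS"): "SUCCESS",
    ("INACTIVE", "SUCCESS"): "SUCCESS",
    ("INACTIVE", "TERMINATED"): "FAILURE",
    ("INACTIVE", "FAILURE"): "FAILURE",
}


def new_box_status(current_box, new_job):
    """Box status after its only job changes to new_job."""
    if current_box == "TERMINATED":
        return current_box  # any job change leaves a terminated box as it is
    # Rows marked NO CHANGE (and unlisted pairs) keep the current status.
    return BOX_TRANSITIONS.get((current_box, new_job), current_box)


print(new_box_status("SUCCESS", "FAILURE"))
print(new_box_status("FAILURE", "FAILURE"))
print(new_box_status("TERMINATED", "SUCCESS"))
```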
Example:
/* the box itself */
insert_job: <box_name>   job_type: b
.....
/* a command job placed inside the box */
insert_job: job_name1   job_type: c
box_name: <box_name>
.....
- A Command Job runs a single command, which can be a shell script or any executable program.
- When this type of job is run, the result is the execution of a specified command on a client machine.
- When all the starting conditions are met, AutoSys runs this command and captures its exit code upon completion.
- The exit event (either SUCCESS or FAILURE) and the exit code value are stored in the database.
Profile Script
- For each job, you can specify a script to be sourced before the execution of the command that defines the environment in which the command is to be run.
- All commands are run under the Bourne shell (/bin/sh). Therefore, all statements in the profile must use /bin/sh syntax.
- If a profile is not specified, the default AutoSys profile, /etc/auto.profile, is used.
- If the profile attribute is specified, that profile is searched for on the machine on which the command is to run.
/* ----------------- template ----------------- */
insert_job: template job_type: c
box_name: box1
command: ls -l
machine: localhost
4. AutoSys Status Abbreviations
The AutoSys job statuses are:
- INACTIVE (IN): The job has not yet been processed. Either the job has never been run, or its status was intentionally altered to “turn off” its previous completion status.
- ACTIVATED (AC): The box that contains the job is now in the RUNNING state, but the job itself has not started yet.
- STARTING (ST): The event processor has initiated the start job procedure with the Remote Agent.
- RUNNING (RU): The job is running.
- SUCCESS (SU): The job or box job finished successfully.
- FAILURE (FA): The job failed with an exit code. AutoSys issues an alarm when the job fails if alarm_if_fail: 1 is set for it.
- TERMINATED (TE): The job terminated while in the RUNNING state. This happens if a user sends a KILLJOB event, or if the job was defined to terminate when the box it is in fails. A job may also be terminated if it has exceeded its maximum run time (the term_run_time attribute, if one was specified for the job), or through a UNIX kill command. If the job itself fails, it has a FAILURE status, not a TERMINATED status. AutoSys issues an alarm if a job is terminated.
- RESTART (RE): The job was unable to start due to hardware problems, and has been scheduled to restart.
- QUE_WAIT (QW): The job can logically run (all the starting conditions have been met), but there are not enough machine resources available.
- ON_HOLD (OH): This job is on hold and will not be run until it receives the JOB_OFF_HOLD event.
- ON_ICE (OI): This job is removed from all conditions and logic, but is still defined to AutoSys. This condition is like deactivating the job. It will remain on ice until it receives the JOB_OFF_ICE event.
- REFRESH (RD/RF): RD -> Dependencies, RF -> Filewatcher
5. Events sent by users
By using the sendevent command or the Send Event dialog, you can send execute events that affect the running of a job. These are the execute events that you can send, if you have the appropriate permissions:
- sendevent -E STARTJOB -J job_name
- sendevent -E FORCE_STARTJOB -J job_name
- sendevent -E JOB_ON_ICE -J job_name
- sendevent -E JOB_OFF_ICE -J job_name
- sendevent -E JOB_ON_HOLD -J job_name
- sendevent -E JOB_OFF_HOLD -J job_name
- sendevent -E SET_GLOBAL -G <job_name>=READY
- sendevent -E SET_GLOBAL -G <variables>=FREE
- sendevent -E STOP_DEMON - to stop AutoSys
- sendevent -E CHANGE_STATUS -s INACTIVE -J job_name
- sendevent -E KILLJOB -J job_name
Other sendevent events include DELETEJOB, CHANGE_PRIORITY, COMMENT, SEND_SIGNAL, and ALARM.
Syntax
sendevent -E event [-S autoserv_instance] [-A alarm] [-J job_name]
[-s status] [-C comment] [-P priority] [-M max_send_trys]
[-q job_queue_priority] [-T "time_of_event"]
[-G "global_name=value"] [-k signal_number(s)] [-u]
6. Autorep Commands
Reports information about a job, jobs within boxes, machines, and machine status. Also reports information about job overrides and global variables.
Syntax
autorep {-J {ALL | job_name} | -M machine_name | -G global_name} [-s | -d | -q] [-o over_num] [-r run_number]
autorep -J (job name here) - Display the list of jobs with complete details with box/jobname, last/latest run date & time, status, exit code, etc.
autorep -J (job name here) -r (run number) - Information about an earlier run; a negative value counts back from the most recent run
example : autorep -J (job name here) -r -1
autorep -J (job name here) -q - View the JIL code for any AutoSys job
autorep -J (job name here) -d - Detailed report
7. Job Dependencies Related to Job Status
This is the syntax for conditions based on AutoSys job status:
condition: status(job_name)
• success (s) Indicates that the status condition for job_name is SUCCESS.
• failure (f) Indicates that the status condition for job_name is FAILURE.
• done (d) Indicates that the status condition for job_name is SUCCESS, FAILURE or TERMINATED.
• terminated (t) Indicates that the status condition for job_name is TERMINATED.
• notrunning (n) Indicates that the status condition for job_name is anything except RUNNING.
• exitcode (e) Indicates that the job will start based on the exit code of job_name, using the form:
exitcode (job_name) operator value
where:
• job_name Is the name of the job upon which the “new” job is dependent.
• operator Is one of the following exitcode comparison operators:
• =, != (not equal), <, >, <=, or >=
• value Is any numeric value.
You can abbreviate the dependency specification exitcode with the letter e (uppercase or lowercase).
For example, to make the “JobB” redial job start when JobA exits with code 4, you would enter the following job dependency specification for JobB:
e (JobA) = 4
You can use any job status or exit codes as part of the specification for starting conditions. With this latitude, you can program branching paths that will provide alternative actions for all types of error conditions.
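To make the exitcode comparison concrete, here is a minimal Python sketch that parses a specification of the form shown above and evaluates it against known exit codes. It is an illustration only, not how AutoSys itself evaluates conditions; the function name exitcode_condition_met and the exit_codes mapping are hypothetical.

```python
import operator
import re

OPERATORS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, ">": operator.gt,
    "<=": operator.le, ">=": operator.ge,
}

# Accepts "exitcode" or its abbreviation "e", in either case.
PATTERN = re.compile(
    r"^\s*(?:exitcode|e)\s*\(\s*(\w+)\s*\)\s*(!=|<=|>=|=|<|>)\s*(-?\d+)\s*$",
    re.IGNORECASE,
)


def exitcode_condition_met(condition, exit_codes):
    """Evaluate an exitcode condition against a {job_name: exit_code} mapping."""
    match = PATTERN.match(condition)
    if match is None:
        raise ValueError(f"not an exitcode condition: {condition!r}")
    job_name, op, value = match.groups()
    return OPERATORS[op](exit_codes[job_name], int(value))


print(exitcode_condition_met("e (JobA) = 4", {"JobA": 4}))
print(exitcode_condition_met("exitcode(JobA) >= 5", {"JobA": 4}))
```

The first call reports that the condition is met, so JobB would be eligible to start; the second reports that it is not.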
8. JIL Sub_Commands
• command: the actual command to be executed
• start_mins: which minutes the job should be started every hour. This is a comma-separated list of minutes, e.g. 0,15,30,45
• start_times: actual times per day the job should be started. This is a comma-separated list of times, e.g. "5:15,15:30,23:59"
• start_days: comma-separated list of days on which the job should run, exclusive. These days are denoted using two-letter abbreviations: su,mo,tu,we,th,fr,sa
• condition: when a job should run based on the state of other jobs, e.g. "failure(JOB1) and (success(JOB2) or success(JOB3)) and notrunning(JOB4)"
• alarm_if_fail: boolean (0/1) value determining whether an alarm should be sent upon a job failure event.
• date_conditions: 1 -> determines whether time/day constraints will be taken into consideration when determining when to kick off jobs. If 0, start_times, start_mins, and start_days will be ignored, and job control will be based on the condition field only.
• std_in_file: file that will be redirected into the command as STDIN. Specify full path. Autosys does not provide this utility, but it should be trivial to add. (ADS Extension)
• std_out_file: file that will receive the job's STDOUT. Specify full path.
• std_err_file: file that will receive the job's STDERR. Specify full path.
• profile: file that will be sourced by the shell prior to kicking off the job.
• run_window: A range of hours, such as "7:00 - 10:00", which restricts the possible start times of the job. If you specify "start_mins: 15,30", then the job will run at 7:15, 7:30, 8:15 ... 9:30, and will then wait until the next available 7:15 (taking day restrictions into account).
• auto_delete: If set to 0, AutoSys will immediately delete job definitions only if the job completed successfully. If the job did not complete successfully, AutoSys will keep the job definition for 7 days before automatically deleting it. This attribute lets AutoSys schedule and run a one-time batch job.
• run_calendar: The days on which a job should or should not be run can be specified by way of a custom calendar. Custom calendars, specified through the AutoSys Graphical Calendar Facility or the autocal_asc command, can include any number of dates. Each calendar is a separate object with a unique name, and a calendar can be associated with one or more jobs.
• autocal / autocal_asc: The autocal command starts the AutoSys Graphical Calendar Facility, which is used to define and maintain AutoSys calendars. Calendars are lists of dates that you can use to schedule the days on which jobs should, or should not, run. The autocal_asc command adds and deletes custom calendar definitions.
• days_of_week: The days of the week attribute specifies the days on which the job should be run. You can specify one or more days, or “all” for every day.
• exclude_calendar: Do NOT Run on Days in Calendar
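The way run_window narrows the times produced by start_mins can be sketched in Python. This is a simplified model, assuming a start time qualifies when it falls between the two window times (both ends included) and that a window whose end is earlier than its start wraps past midnight; the function name starts_in_window is hypothetical.

```python
def to_minutes(clock_text):
    """Convert "H:MM" to minutes after midnight."""
    hours, minutes = clock_text.strip().split(":")
    return int(hours) * 60 + int(minutes)


def starts_in_window(window, start_mins):
    """List the "H:MM" start times allowed by a run window, in clock order."""
    begin_text, end_text = window.split("-")
    begin, end = to_minutes(begin_text), to_minutes(end_text)
    times = []
    for hour in range(24):
        for minute in sorted(start_mins):
            t = hour * 60 + minute
            if begin <= end:
                inside = begin <= t <= end
            else:
                inside = t >= begin or t <= end  # window wraps past midnight
            if inside:
                times.append(f"{hour}:{minute:02d}")
    return times


print(starts_in_window("7:00 - 10:00", [15, 30]))
```

Under these assumptions the window "7:00 - 10:00" with start minutes 15 and 30 allows six starts, from 7:15 through 9:30.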
Template example:
/* ----------------- template ----------------- */
insert_job: template job_type: c
box_name: box1
command: ls -l
machine: localhost
owner: lyota01@TANT-A01
permission: gx,ge,wx,we,mx,me
date_conditions: 1
days_of_week: all
start_times: "15:00, 14:00"
run_window: "14:00 - 6:00"
condition: s (job1)
description: "description field"
term_run_time: 60
box_terminator: 1
job_terminator: 1
std_out_file: /tmp/std_out
std_err_file: /tmp/std_err
min_run_alarm: 5
max_run_alarm: 10
alarm_if_fail: 1
max_exit_success: 2
chk_files: /tmp 2000
profile: /tmp/.profile
job_load: 25
priority: 1