Configuring Jobs

Jobs are configured by files in the jobs directory. The jobs directory may contain subdirectories, to organize jobs. Each job file uses the .yaml file suffix, and specifies one job, with a unique job ID, in YAML format.

The Apsis config file specifies the location of the jobs directory; see [[config]].

A job config contains these top-level keys:

  • params (optional)

  • program, which specifies what to run

  • schedule, which specifies when to schedule runs

  • metadata (optional), additional information not interpreted by Apsis

  • conditions (optional) that must be met for a run

  • actions (optional) to take when a run changes state

For durations in seconds, you may also use durations like 30s, 10 min (600 seconds), 1.5h (5400 seconds), and 1 day (86400 seconds).

Job ID

The job’s ID is given by the path under the jobs directory, with the .yaml suffix removed. For example, if the jobs directory is /path/to/jobs, the job file /path/to/jobs/data/pipeline/start.yaml has the job ID data/pipeline/start.

Params

The params key in the job config takes a list of parameter names. For example,

params:
  - date
  - message

or equivalently,

params: ["date", "message"]

If params is omitted, the job has no parameters.

Parameters aren’t required; a job without parameters can be run repeatedly, just like a cron job.

Program

The program key program describes how a run executes. Apsis provides several types of programs, and you may extend Apsis with additional program types as well.

See Programs for more information.

Schedule

The schedule key specifies when new runs are created and for when they are scheduled.

A job may have a single schedule, given as a dict, or multiple schedules, as a list of dicts.

# Single schedule
schedule:
    type: interval
    interval: 3600

# Two schedules
schedule:
  - type: interval
    interval: 3600
  - type: daily
    tz: UTC
    daytime: 12:00:00

See Configuring Schedules for more information.

Metadata

A job can store arbitrary metadata, such as descriptive text, tags, and operator instructions. The metadata key accepts arbitrary subkeys. None affect how runs of the job are executed.

Apsis does understand certain metadata keys. The description key contains descriptive Markdown text shown in the UI.

metadata:
    description: |
        Daily cleanup job.

        Removes temporary files that have been created within the last 24
        hours.

The labels key is an array of string labels, also shown in the UI.

metadata:
    labels:
        - test
        - blue-team

Any other metadata keys are preserved but ignored by Apsis.

Conditions

The condition key describes zero or more conditions that must be satisified before Apsis starts the run’s program. Apsis provides several types of conditions, and you may extend Apsis with additional condition types as well.

See Conditions for more information.

Actions

FIXME: Write this.

Binding

Apsis creates specific runs for a job, according to the job’s schedule. When Apsis creates a run, it binds the run’s arguments in the program and conditions. Each string-valued config field is expanded as a jinja2 template. The run’s args are available as substitution variables.

For example, consider this job config:

params:
- color
- fruit

program:
    type: shell
    command: "echo The color of {{ fruit }} is {{ color }}."

When Apsis creates a run with color: red and fruit: apple, it expands the program to,

program:
    type: shell
    command: "echo The color of apple is red."

The contents of a {{ … }} expansion is evaluated as a jinja2 expression. The following additional Ora types and functions are available:

These functions and types allow you to perform time computations on program and condition dates and times. For example, this job has a dependency on another job load data. Each run of this job is labeled with a date, and depends on a load data run with the previous date, according to the workdays calendar.

params: [region, date]

...

condition:
    type: dependency
    job_id: load data
    args:
        date: {{ get_calendar('workdays').before(date) }}

Keep in mind that Apsis run arguments are always strings, so Apsis converts the result using str.