Job Task
Usage
job_task(
  task_key,
  description = NULL,
  depends_on = c(),
  existing_cluster_id = NULL,
  new_cluster = NULL,
  job_cluster_key = NULL,
  task,
  libraries = NULL,
  email_notifications = NULL,
  timeout_seconds = NULL,
  max_retries = 0,
  min_retry_interval_millis = 0,
  retry_on_timeout = FALSE
)
Arguments
- task_key
A unique name for the task. This field is used to refer to this task from other tasks. This field is required and must be unique within its parent job. On db_jobs_update() or db_jobs_reset(), this field is used to reference the tasks to be updated or reset. The maximum length is 100 characters.
- description
An optional description for this task. The maximum length is 4096 bytes.
- depends_on
Vector of task_key values specifying the dependency graph of the task. All task_key values specified in this field must complete successfully before this task is executed. This field is required when a job consists of more than one task.
- existing_cluster_id
ID of an existing cluster that is used for all runs of this task.
- new_cluster
Instance of new_cluster().
- job_cluster_key
The task is executed reusing the cluster specified in db_jobs_create() with the job_clusters parameter.
- task
One of notebook_task(), spark_jar_task(), spark_python_task(), spark_submit_task(), pipeline_task(), python_wheel_task().
- libraries
Instance of libraries().
- email_notifications
Instance of email_notifications().
- timeout_seconds
An optional timeout applied to each run of this job task. The default behavior is to have no timeout.
- max_retries
An optional maximum number of times to retry an unsuccessful run. A run is considered to be unsuccessful if it completes with the FAILED result_state or INTERNAL_ERROR life_cycle_state. The value -1 means to retry indefinitely and the value 0 means to never retry. The default behavior is to never retry.
- min_retry_interval_millis
Optional minimal interval in milliseconds between the start of the failed run and the subsequent retry run. The default behavior is that unsuccessful runs are immediately retried.
- retry_on_timeout
Optional policy to specify whether to retry a task when it times out. The default behavior is to not retry on timeout.
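Examples
The arguments above can be combined as in the following minimal sketch, which defines two tasks where the second depends on the first. It assumes the brickster package is installed and that notebook_task() accepts a notebook_path argument; the cluster ID and notebook paths are hypothetical placeholders, not values from this documentation.

```r
library(brickster)

# First task: runs a notebook on an existing cluster.
# The cluster ID and notebook path are hypothetical placeholders.
ingest <- job_task(
  task_key = "ingest",
  description = "Load raw data",
  existing_cluster_id = "0000-000000-example1",
  task = notebook_task(notebook_path = "/Shared/example/ingest"),
  timeout_seconds = 3600,             # stop a run after one hour
  max_retries = 2,                    # retry an unsuccessful run at most twice
  min_retry_interval_millis = 60000,  # wait at least one minute before retrying
  retry_on_timeout = TRUE             # also retry when the timeout is reached
)

# Second task: only starts after "ingest" completes successfully.
transform <- job_task(
  task_key = "transform",
  depends_on = c("ingest"),
  existing_cluster_id = "0000-000000-example1",
  task = notebook_task(notebook_path = "/Shared/example/transform")
)
```

Constructing the tasks does not contact a workspace by itself; the resulting objects are intended to be supplied when a job is created or updated, for example through db_jobs_create().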