task

The task configuration type allows running tasks.

Root reference

nodes - (Optional) Number of nodes. Defaults to 1.
name - (Optional) The run name. If not specified, a random name is generated.
image - (Optional) The name of the Docker image to run.
user - (Optional) The user inside the container, user_name_or_id[:group_name_or_id] (e.g., ubuntu, 1000:1000). Defaults to the default image user.
privileged - (Optional) Run the container in privileged mode.
entrypoint - (Optional) The Docker entrypoint.
working_dir - (Optional) The path to the working directory inside the container. It is specified relative to the repository directory (/workflow) and must be inside it. Defaults to ".".
registry_auth - (Optional) Credentials for pulling a private Docker image.
python - (Optional) The major version of Python. Mutually exclusive with image.
nvcc - (Optional) Use image with NVIDIA CUDA Compiler (NVCC) included. Mutually exclusive with image.
env - (Optional) The mapping or the list of environment variables.
resources - (Optional) The resource requirements to run the configuration.
volumes - (Optional) The volume mount points.
ports - (Optional) Port numbers/mapping to expose.
commands - (Optional) The bash commands to run.
backends - (Optional) The backends to consider for provisioning (e.g., [aws, gcp]).
regions - (Optional) The regions to consider for provisioning (e.g., [eu-west-1, us-west4, westeurope]).
instance_types - (Optional) The cloud-specific instance types to consider for provisioning (e.g., [p3.8xlarge, n1-standard-4]).
reservation - (Optional) The existing reservation to use for instance provisioning. Supports AWS Capacity Reservations and Capacity Blocks.
spot_policy - (Optional) The policy for provisioning spot or on-demand instances: spot, on-demand, or auto. Defaults to on-demand.
retry - (Optional) The policy for resubmitting the run. Defaults to false.
max_duration - (Optional) The maximum duration of a run (e.g., 2h or 1d). After it elapses, the run is forced to stop. Defaults to off.
max_price - (Optional) The maximum instance price per hour, in dollars.
creation_policy - (Optional) The policy for using instances from the pool. Defaults to reuse-or-create.
idle_duration - (Optional) Time to wait before terminating idle instances. Defaults to 5m for runs and 3d for fleets. Use off for unlimited duration.
termination_policy - (Optional) Deprecated in favor of idle_duration.
termination_idle_time - (Optional) Deprecated in favor of idle_duration.
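
As a minimal sketch of how several of the root properties above fit together, the configuration below runs two commands on a stock Python image. The `type: task` key is an assumption (it does not appear in the list above), and the run name, file names, and values are hypothetical placeholders.

```yaml
type: task                 # assumption: key that selects the task configuration type
name: train-example        # hypothetical run name; a random one is generated if omitted
python: "3.11"             # illustrative version; mutually exclusive with image
env:
  - EPOCHS=3               # list form; a mapping is the other documented form
commands:
  - pip install -r requirements.txt   # hypothetical file
  - python train.py                   # hypothetical script
ports:
  - 6006
backends: [aws, gcp]
regions: [eu-west-1, us-west4]
spot_policy: auto          # spot, on-demand, or auto
max_duration: 2h
max_price: 1.5             # dollars per hour
idle_duration: 5m
```

The Python version is quoted so that YAML does not read it as a floating-point number.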

retry

on_events - The list of events that should be handled with retry. Supported events are no-capacity, interruption, and error.
duration - (Optional) The maximum period of retrying the run, e.g., 4h or 1d.
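
For illustration, a retry policy using the two properties above might look like the following sketch. The `type: task` key and the command are assumptions added only to make the fragment complete.

```yaml
type: task                 # assumption: key that selects the task configuration type
commands:
  - python train.py        # hypothetical script
retry:
  # Resubmit the run on any of the supported events
  on_events: [no-capacity, interruption, error]
  # Stop retrying after four hours
  duration: 4h
```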

resources

cpu - (Optional) The number of CPU cores. Defaults to 2.. (an open-ended range, i.e., 2 or more).
memory - (Optional) The RAM size (e.g., 8GB). Defaults to 8GB.. (an open-ended range, i.e., 8GB or more).
shm_size - (Optional) The size of shared memory (e.g., 8GB). If you are using parallel communicating processes (e.g., dataloaders in PyTorch), you may need to configure this.
gpu - (Optional) The GPU requirements. Can be set to a number, a string (e.g. A100, 80GB:2, etc.), or an object.
disk - (Optional) The disk resources.
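
A short sketch of a resources block using scalar values follows. The numbers are illustrative only, and the GPU string is one of the forms quoted above.

```yaml
resources:
  cpu: 4            # number of CPU cores
  memory: 16GB      # RAM size
  shm_size: 8GB     # shared memory, e.g. for PyTorch dataloaders
  gpu: 80GB:2       # string form taken from the example in the gpu description
```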

resources.gpu

vendor - (Optional) The vendor of the GPU/accelerator, one of: nvidia, amd, google (alias: tpu).
name - (Optional) The GPU name or list of names.
count - (Optional) The number of GPUs. Defaults to 1.
memory - (Optional) The memory size of each GPU (e.g., 16GB). Can be set to a range (e.g., 16GB.. or 16GB..80GB).
total_memory - (Optional) The total memory size across all GPUs (e.g., 32GB). Can be set to a range (e.g., 16GB.. or 16GB..80GB).
compute_capability - (Optional) The minimum compute capability of the GPU (e.g., 7.5).
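
When the string form is not expressive enough, the GPU requirements can be written as an object. The following is a sketch with illustrative values; the specific GPU names are examples, not recommendations.

```yaml
resources:
  gpu:
    vendor: nvidia
    name: [A100, H100]        # a single name or a list of acceptable names
    count: 2
    memory: 40GB..80GB        # range syntax: lower and upper bound
    compute_capability: 7.5   # minimum compute capability
```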

resources.disk

size - The disk size. Can be set to a range (e.g., 100GB.. or 100GB..200GB).
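
As a brief sketch, a disk requirement given as a bounded range could be written like this (the sizes are illustrative):

```yaml
resources:
  disk:
    size: 100GB..200GB   # use 100GB.. for an open-ended range
```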

registry_auth

username - The username.
password - The password or access token.
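
A sketch of pulling a private image with these credentials follows. The image name, username, and token are hypothetical placeholders, and `type: task` is an assumption not stated in this reference.

```yaml
type: task                                        # assumption: key that selects the task configuration type
image: registry.example.com/team/trainer:latest   # hypothetical private image
registry_auth:
  username: my-user                               # placeholder
  password: my-access-token                       # placeholder; do not commit real secrets
commands:
  - python train.py                               # hypothetical script
```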

volumes[n]

name - The network volume name or the list of network volume names to mount. If a list is specified, one of the volumes in the list will be mounted. Specify volumes from different backends/regions to increase availability.
path - The absolute container path to mount the volume at.
instance_path - The absolute path on the instance (host). Used for instance volumes in place of name.
path - The absolute path in the container to mount the instance path at.

Short syntax

The short syntax for volumes is a colon-separated string in the form of source:destination:

  • volume-name:/container/path for network volumes
  • /instance/path:/container/path for instance volumes
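
To compare the two syntaxes side by side, here is a sketch of a volumes list that mixes the long and short forms. All volume names and paths are hypothetical.

```yaml
volumes:
  # Long syntax, network volume
  - name: my-volume
    path: /volume_data
  # Long syntax, network volume chosen from a list of candidates
  - name: [my-volume-eu, my-volume-us]
    path: /datasets
  # Long syntax, instance volume
  - instance_path: /mnt/data
    path: /data
  # Short syntax, network volume
  - my-other-volume:/checkpoints
  # Short syntax, instance volume
  - /mnt/cache:/cache
```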