`fleet`

The fleet configuration type allows creating and updating fleets.

Root reference

`name` - (Optional) The fleet name.

`env` - (Optional) The mapping or the list of environment variables.

`ssh_config` - (Optional) The parameters for adding instances via SSH.

`nodes` - (Optional) The number of instances.

`placement` - (Optional) The placement of instances: `any` or `cluster`.

`reservation` - (Optional) The existing reservation to use for instance provisioning. Supports AWS Capacity Reservations and Capacity Blocks.

`resources` - (Optional) The resource requirements.

`blocks` - (Optional) The number of blocks to split the instance into: a number or `auto`. `auto` means as many as possible. The number of GPUs and CPUs must be divisible by the number of blocks. Defaults to `1`, i.e., do not split.

`backends` - (Optional) The backends to consider for provisioning (e.g., `[aws, gcp]`).

`regions` - (Optional) The regions to consider for provisioning (e.g., `[eu-west-1, us-west4, westeurope]`).

`availability_zones` - (Optional) The availability zones to consider for provisioning (e.g., `[eu-west-1a, us-west4-a]`).

`instance_types` - (Optional) The cloud-specific instance types to consider for provisioning (e.g., `[p3.8xlarge, n1-standard-4]`).

`spot_policy` - (Optional) The policy for provisioning spot or on-demand instances: `spot`, `on-demand`, `auto`. Defaults to `on-demand`.

`retry` - (Optional) The policy for provisioning retry. Defaults to `false`.

`max_price` - (Optional) The maximum instance price per hour, in dollars.

`idle_duration` - (Optional) Time to wait before terminating idle instances. Defaults to `5m` for runs and `3d` for fleets. Use `off` for unlimited duration.

`tags` - (Optional) The custom tags to associate with the resource. The tags are also propagated to the underlying backend resources. If a custom tag conflicts with a backend-level tag, the backend-level tag is not overridden.
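As a rough illustration of how the root properties combine, here is a minimal sketch of a cloud fleet configuration. The `type: fleet` line is not listed in the reference above and is assumed here as the key that selects the fleet configuration type. The fleet name, the environment variable, and all values are hypothetical.

```yaml
type: fleet              # assumed type selector; not part of the property list above
name: my-fleet           # hypothetical fleet name

nodes: 2                 # number of instances
placement: cluster       # `any` or `cluster`

backends: [aws]          # only consider these backends
regions: [eu-west-1]     # only consider these regions

spot_policy: auto        # `spot`, `on-demand`, or `auto`
max_price: 2.5           # maximum price per instance per hour, in dollars
idle_duration: 1h        # terminate instances after one idle hour

env:                     # mapping form of environment variables
  MY_VAR: my-value       # hypothetical variable
```

Every property other than `type` can be omitted, in which case the defaults described above apply.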

`ssh_config`

`user` - (Optional) The user to log in with on all hosts.

`port` - (Optional) The SSH port to connect to.

`identity_file` - (Optional) The private key to use for all hosts.

`proxy_jump` - (Optional) The SSH proxy configuration for all hosts.

`hosts` - The per-host connection parameters: a hostname or an object that overrides the default SSH parameters.

`network` - (Optional) The network address for cluster setup in the format `<ip>/<netmask>`. `dstack` will use IP addresses from this network for communication between hosts. If not specified, `dstack` will use IPs from the first found internal network.
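The following sketch shows how these properties might look for a fleet built from existing machines reachable over SSH. It assumes `type: fleet` as the type selector (not listed in this reference); the addresses, user, and key path are placeholders.

```yaml
type: fleet                        # assumed type selector
name: my-ssh-fleet                 # hypothetical fleet name
placement: cluster

ssh_config:
  user: ubuntu                     # default user for all hosts
  port: 22                         # default SSH port for all hosts
  identity_file: ~/.ssh/id_rsa     # default private key for all hosts
  hosts:                           # plain hostnames inherit the defaults above
    - 192.168.100.10
    - 192.168.100.11
  network: 192.168.100.0/24        # <ip>/<netmask> used for traffic between hosts
```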

`ssh_config.proxy_jump`

`hostname` - The IP address or domain of the proxy host.

`port` - (Optional) The SSH port of the proxy host.

`user` - The user to log in with on the proxy host.

`identity_file` - The private key to use for the proxy host.
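A shared proxy (bastion) for all hosts could be sketched as follows. The bastion domain, users, and key paths are hypothetical, and `type: fleet` is again an assumed type selector.

```yaml
type: fleet                            # assumed type selector
name: my-ssh-fleet                     # hypothetical fleet name

ssh_config:
  user: ubuntu
  identity_file: ~/.ssh/id_rsa
  proxy_jump:                          # applies to every host listed below
    hostname: bastion.example.com      # hypothetical proxy host
    port: 22                           # optional
    user: jump-user                    # required for the proxy host
    identity_file: ~/.ssh/bastion_key  # required for the proxy host
  hosts:
    - 10.0.0.10
    - 10.0.0.11
```

Note that `hostname`, `user`, and `identity_file` are not marked optional for the proxy, so the sketch sets all three.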

`ssh_config.hosts[n]`

`hostname` - The IP address or domain to connect to.

`port` - (Optional) The SSH port to connect to for this host.

`user` - (Optional) The user to log in with for this host.

`identity_file` - (Optional) The private key to use for this host.

`proxy_jump` - (Optional) The SSH proxy configuration for this host.

`internal_ip` - (Optional) The internal IP of the host used for communication inside the cluster. If not specified, `dstack` will use the IP address from `network` or from the first found internal network.

`blocks` - (Optional) The number of blocks to split the instance into: a number or `auto`. `auto` means as many as possible. The number of GPUs and CPUs must be divisible by the number of blocks. Defaults to `1`, i.e., do not split.
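To illustrate the two accepted forms of a `hosts` entry, the sketch below mixes a plain hostname with an object that overrides the shared defaults. Addresses, users, and key paths are placeholders, and `type: fleet` is an assumed type selector.

```yaml
type: fleet                            # assumed type selector
name: my-ssh-fleet                     # hypothetical fleet name

ssh_config:
  user: ubuntu                         # default for hosts that do not override it
  identity_file: ~/.ssh/id_rsa
  hosts:
    - 203.0.113.10                     # plain hostname: uses the defaults above
    - hostname: 203.0.113.11           # object form: overrides per host
      port: 2222
      user: admin
      identity_file: ~/.ssh/other_key
      internal_ip: 10.0.0.11           # address used inside the cluster
      blocks: auto                     # split this host into as many blocks as possible
```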

`ssh_config.hosts[n].proxy_jump`

`hostname` - The IP address or domain of the proxy host.

`port` - (Optional) The SSH port of the proxy host.

`user` - The user to log in with on the proxy host.

`identity_file` - The private key to use for the proxy host.

`resources`

`cpu` - (Optional) The CPU requirements.

`memory` - (Optional) The RAM size (e.g., `8GB`). Defaults to `8GB..`.

`shm_size` - (Optional) The size of shared memory (e.g., `8GB`). If you are using parallel communicating processes (e.g., dataloaders in PyTorch), you may need to configure this.

`gpu` - (Optional) The GPU requirements.

`disk` - (Optional) The disk resources.
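A `resources` block for a CPU-only fleet might look like the sketch below. It uses the nested `cpu` and `disk` properties described in the following sections, and it assumes that the open-ended range syntax shown in the defaults (such as `8GB..`) can also be written explicitly and that disk sizes use the same unit format as memory. The fleet name and values are hypothetical; `type: fleet` is an assumed type selector.

```yaml
type: fleet            # assumed type selector
name: my-fleet         # hypothetical fleet name
nodes: 1

resources:
  cpu:
    arch: x86          # `x86` or `arm`
    count: 8..         # at least 8 cores (open-ended range)
  memory: 32GB..       # at least 32GB of RAM
  shm_size: 16GB       # shared memory, e.g. for PyTorch dataloaders
  disk:
    size: 200GB        # assumed to accept the same unit format as memory
```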

`resources.cpu`

`arch` - (Optional) The CPU architecture, one of: `x86`, `arm`.

`count` - (Optional) The number of CPU cores. Defaults to `2..`.

`resources.gpu`

`vendor` - (Optional) The vendor of the GPU/accelerator, one of: `nvidia`, `amd`, `google` (alias: `tpu`), `intel`.

`name` - (Optional) The name of the GPU (e.g., `A100` or `H100`).

`count` - (Optional) The number of GPUs. Defaults to `1..`.

`memory` - (Optional) The RAM size (e.g., `16GB`). Can be set to a range (e.g., `16GB..` or `16GB..80GB`).

`total_memory` - (Optional) The total RAM size (e.g., `32GB`). Can be set to a range (e.g., `16GB..` or `16GB..80GB`).

`compute_capability` - (Optional) The minimum compute capability of the GPU (e.g., `7.5`).
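As a small illustration of the GPU properties and the range syntax, the fragment below requests two NVIDIA A100 GPUs with a bounded memory range. The specific values are only an example.

```yaml
resources:
  gpu:
    vendor: nvidia
    name: A100
    count: 2               # exactly two GPUs
    memory: 40GB..80GB     # bounded range: between 40GB and 80GB
```

An open-ended range such as `16GB..` leaves the upper bound unset.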

`resources.disk`

`size` - Disk size.

`retry`

`on_events` - (Optional) The list of events that should be handled with retry. Supported events are `no-capacity`, `interruption`, `error`. Omit to retry on all events.

`duration` - (Optional) The maximum period of retrying the run, e.g., `4h` or `1d`.
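A retry policy restricted to specific events could be sketched like this. The fleet name is hypothetical and `type: fleet` is an assumed type selector that is not listed in this reference.

```yaml
type: fleet                                 # assumed type selector
name: my-fleet                              # hypothetical fleet name
nodes: 1

retry:
  on_events: [no-capacity, interruption]    # do not retry on `error`
  duration: 4h                              # stop retrying after four hours
```

Leaving out `on_events` retries on all supported events, as described above.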
