Skip to content

Pools

Pools enable the efficient reuse of cloud instances and on-premises servers across runs, simplifying their management.

Adding instances

Automatic provisioning

By default, when using the dstack run command, it tries to reuse an instance from a pool. If no idle instance meets the requirements, dstack automatically provisions a new cloud instance and adds it to the pool.

Reuse policy

To avoid provisioning new cloud instances with dstack run, use --reuse. Your run will be assigned to an idle instance in the pool. If there are no available idle instances in the pool, the run will fail.

Idle duration

By default, dstack run sets the idle duration of a newly provisioned instance to 5m. This means that if the run is finished and the instance remains idle for longer than five minutes, it is automatically removed from the pool. To override the default idle duration, use --idle-duration DURATION with dstack run.

Manual provisioning

To manually provision a cloud instance and add it to a pool, use dstack pool add:

$ dstack pool add --gpu 80GB

 BACKEND     REGION         RESOURCES                     SPOT  PRICE
 tensordock  unitedkingdom  10xCPU, 80GB, 1xA100 (80GB)   no    $1.595
 azure       westus3        24xCPU, 220GB, 1xA100 (80GB)  no    $3.673
 azure       westus2        24xCPU, 220GB, 1xA100 (80GB)  no    $3.673

Continue? [y/n]: y

The dstack pool add command allows specifying resource requirements, along with the spot policy, idle duration, max price, retry policy, and other policies.

Idle duration

The default idle duration if you're using dstack pool add is 72h. To override it, use the --idle-duration DURATION argument.

You can also specify the policies via .dstack/profiles.yml instead of passing them as arguments. For more details on policies and their defaults, refer to .dstack/profiles.yml.

Limitations

The dstack pool add command is not supported for Kubernetes, VastAI, and RunPod backends yet.

Adding on-prem clusters

Any on-prem server that can be accessed via SSH can be added to a pool and used to run workloads.

To add on-prem servers to the pool, use the dstack pool add-ssh command and pass the hostname of your server along with the SSH key.

$ dstack pool add-ssh -i ~/.ssh/id_rsa ubuntu@54.73.155.119

The command accepts the same arguments as the standard ssh command.

Requirements

The on-prem server should be pre-installed with CUDA 12.1 and NVIDIA Docker.

Once the instance is provisioned, you'll see it in the pool and will be able to run workloads on it.

Clusters

If you want on-prem instances to run multi-node tasks, ensure these on-prem servers share the same private network. Additionally, you need to pass the --network option to dstack pool add-ssh:

$ dstack pool add-ssh -i ~/.ssh/id_rsa ubuntu@54.73.155.119 \
    --network 10.0.0.0/24

The --network argument accepts the IP address range (CIDR) of the private network of the instance.

Once you've added multiple instances with the same network value, you can use them as a cluster to run multi-node tasks.

Removing instances

If the instance remains idle for the configured idle duration, dstack removes it and deletes all cloud resources.

To remove an instance from the pool manually, use the dstack pool rm command.

$ dstack pool rm <instance name>

List instances

The dstack pool ps command lists active instances and their status (busy or idle).