Volumes¶
Volumes allow you to persist data between runs. dstack supports two kinds of volumes: network volumes and instance volumes.
Network volumes¶
dstack lets you create network volumes and attach them to dev environments, tasks, and services.
Network volumes are currently supported with the aws, gcp, and runpod backends. Support for other backends and SSH fleets is coming soon.
Define a configuration¶
First, create a YAML file in your project folder. Its name must end with .dstack.yml (e.g., both .dstack.yml and vol.dstack.yml are acceptable).
type: volume
# A name of the volume
name: my-new-volume
# Volumes are bound to a specific backend and region
backend: aws
region: eu-central-1
# Required size
size: 100GB
If you use this configuration, dstack will create a new volume based on the specified options.
Registering existing volumes
If you prefer not to create a new volume but to reuse an existing one (e.g., created manually), you can specify its ID via volume_id. In this case, dstack will register the specified volume so that you can use it with dev environments, tasks, and services.
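As an illustration, a configuration that registers an existing volume could look like the sketch below. The volume name and the ID value are hypothetical placeholders; use the ID assigned by your cloud provider. The size property is left out on the assumption that it is taken from the existing volume.

```yaml
type: volume
# A name of the volume (hypothetical)
name: my-existing-volume

# Volumes are bound to a specific backend and region
backend: aws
region: eu-central-1

# The ID of the volume to register (placeholder, not a real ID)
volume_id: vol-0123456789abcdef0
```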
Reference
See .dstack.yml for all the options supported by volumes, along with multiple examples.
Create, register, or update a volume¶
To create or register the volume, run the dstack apply command:
$ dstack apply -f volume.dstack.yml
Volume my-new-volume does not exist yet. Create the volume? [y/n]: y
NAME BACKEND REGION STATUS CREATED
my-new-volume aws eu-central-1 submitted now
When creating the volume, dstack automatically creates an ext4 file system on it.
Once created, the volume can be attached to dev environments, tasks, and services.
Attach a volume¶
Dev environments, tasks, and services let you attach any number of network volumes.
To attach a network volume, specify its name using the volumes property and specify where to mount its contents:
type: dev-environment
# A name of the dev environment
name: vscode-vol
ide: vscode
# Map the name of the volume to any path
volumes:
  - name: my-new-volume
    path: /volume_data
# You can also use the short syntax in the `name:path` form
# volumes:
#   - my-new-volume:/volume_data
Once you run this configuration, the volume will be mounted at /volume_data inside the dev environment, and its contents will persist across runs.
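The same volumes property applies to tasks and services. A minimal sketch of a task that appends to a file on the volume, assuming the volume my-new-volume from above already exists; the task name and the commands are illustrative only:

```yaml
type: task
# A name of the task (hypothetical)
name: volume-demo

# Illustrative commands: append a timestamp, then print the log
commands:
  - date >> /volume_data/runs.log
  - cat /volume_data/runs.log

# Mount the network volume at /volume_data
volumes:
  - name: my-new-volume
    path: /volume_data
```

If the volume persists as described, each subsequent run of this task should print one more line than the previous one.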
Attaching volumes across regions and backends
If you're unsure in advance which region or backend you'd like to use (or which is available), you can specify multiple volumes for the same path.
volumes:
  - name: [my-aws-eu-west-1-volume, my-aws-us-east-1-volume]
    path: /volume_data
dstack will attach one of the volumes based on the region and backend of the run.
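For this to work, each of the listed volumes has to exist in its own region. A minimal sketch of the two volume configurations, assuming each one is saved in its own file (the file names are hypothetical) and reusing the size from the earlier example:

```yaml
# File: vol-eu.dstack.yml (hypothetical file name)
type: volume
name: my-aws-eu-west-1-volume
backend: aws
region: eu-west-1
size: 100GB
---
# File: vol-us.dstack.yml (hypothetical file name)
# The separator above only marks the file boundary in this sketch;
# save each configuration as a separate file.
type: volume
name: my-aws-us-east-1-volume
backend: aws
region: us-east-1
size: 100GB
```

Each file would then be applied separately with dstack apply -f.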
Limitations
When you run a dev environment, task, or service, dstack automatically mounts the project folder contents to /workflow (and sets that as the current working directory). Right now, dstack doesn't allow you to attach volumes to /workflow or any of its subdirectories.
Manage volumes¶
List volumes¶
The dstack volume list command lists created and registered volumes:
$ dstack volume list
NAME BACKEND REGION STATUS CREATED
my-new-volume aws eu-central-1 active 3 weeks ago
Delete volumes¶
When the volume isn't attached to any active dev environment, task, or service, you can delete it using dstack delete:
$ dstack delete -f vol.dstack.yml
If the volume was created using dstack, it will be physically destroyed along with its data.
If you've registered an existing volume, it will be de-registered from dstack, but its data will be kept.
Instance volumes¶
Instance volumes are currently supported on all backends except runpod, vastai, and kubernetes.
Unlike network volumes, which are persistent external resources mounted over the network, instance volumes are part of the instance storage. An instance volume is simply a filesystem path (a directory or a file) on the instance that is mounted inside the run container.
As a consequence, the contents of an instance volume are specific to the instance where the run is executed. Data persistence, integrity, and even existence are guaranteed only if the subsequent run is executed on exactly the same instance and there are no other runs in between.
Manage volumes¶
You don't need to create or delete instance volumes, and they are not displayed in the dstack volume list command output.
Attach a volume¶
Dev environments, tasks, and services let you attach any number of instance volumes.
To attach an instance volume, specify the instance_path and path in the volumes property:
type: dev-environment
# A name of the dev environment
name: vscode-vol
ide: vscode
# Map the instance path to any container path
volumes:
  - instance_path: /mnt/volume
    path: /volume_data
# You can also use the short syntax in the `instance_path:path` form
# volumes:
#   - /mnt/volume:/volume_data
Use cases¶
Despite the limitations, instance volumes can still be useful in some cases.
For example, if runs regularly install packages with pip install, include an instance volume in the run configuration to reuse the pip cache between runs:
type: task
volumes:
- /dstack-cache/pip:/root/.cache/pip
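A fuller version of this task might look like the sketch below. The task name and the installed package are illustrative, and the container path assumes the run executes as root so that pip keeps its cache under /root/.cache/pip:

```yaml
type: task
# A name of the task (hypothetical)
name: pip-cache-demo

# Illustrative command; any pip install benefits from the cache
commands:
  - pip install requests

# Reuse the pip cache stored on the instance between runs
volumes:
  - /dstack-cache/pip:/root/.cache/pip
```

The cache is reused only when a later run lands on the same instance, as described above.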
If you manage your own instances, you can mount network storage (e.g., NFS or SMB) on the hosts and access it from the runs.
For example, if you mount the same network storage on all fleet instances at the same path, /mnt/nfs-storage, you can treat the instance volume as shared persistent storage:
type: task
volumes:
- /mnt/nfs-storage:/storage
FAQ¶
Can I use network volumes across backends?¶
Since volumes are backed by cloud network disks, you can only use them within the same cloud. If you need to access data across different backends, you should either use object storage or replicate the data across multiple volumes.
Can I use network volumes across regions?¶
Typically, network volumes are associated with specific regions, so you can't use them in other regions. Often, volumes are also linked to availability zones, but some providers support volumes that can be used across different availability zones within the same region.
If you don't want to limit a run to one particular region, you can create different volumes for different regions and specify them for the same mount point as documented above.
Can I attach network volumes to multiple runs or instances?¶
You can mount a volume in multiple runs. This feature is currently supported only by the runpod backend.