Volumes¶
Volumes enable data persistence between runs of dev environments, tasks, and services.
dstack supports two kinds of volumes:
- Network volumes — provisioned via backends and mounted to specific container directories. Ideal for persistent storage.
- Instance volumes — bind directories on the host instance to container directories. Useful as a cache for cloud fleets or for persistent storage with SSH fleets.
Network volumes¶
Network volumes are currently supported for the aws, gcp, and runpod backends.
Define a configuration¶
First, define a volume configuration as a YAML file in your project folder.
The filename must end with .dstack.yml (e.g., .dstack.yml or volume.dstack.yml are both acceptable).
type: volume
# A name of the volume
name: my-volume
# Volumes are bound to a specific backend and region
backend: aws
region: eu-central-1
# Required size
size: 100GB
If you use this configuration, dstack will create a new volume based on the specified options.
Register existing volumes
If you prefer to reuse an existing volume (e.g., one created manually) instead of creating a new one, you can specify its ID via volume_id. In this case, dstack will register the specified volume so that you can use it with dev environments, tasks, and services.
type: volume
# The name of the volume
name: my-volume
# Volumes are bound to a specific backend and region
backend: aws
region: eu-central-1
# The ID of the volume in AWS
volume_id: vol1235
Filesystem
If you register an existing volume, you must ensure the volume already has a filesystem.
Reference
For all volume configuration options, refer to the reference.
Create, register, or update a volume¶
To create or register the volume, pass the volume configuration to dstack apply:
$ dstack apply -f volume.dstack.yml
Volume my-volume does not exist yet. Create the volume? [y/n]: y
NAME       BACKEND  REGION        STATUS     CREATED
my-volume  aws      eu-central-1  submitted  now
Once created, the volume can be attached to dev environments, tasks, and services.
When creating a network volume, dstack automatically creates an ext4 filesystem on it.
Attach a volume¶
Dev environments, tasks, and services let you attach any number of network volumes.
To attach a network volume, specify its name using the volumes property and specify where to mount its contents:
type: dev-environment
# A name of the dev environment
name: vscode-vol
ide: vscode
# Map the name of the volume to any path
volumes:
  - name: my-volume
    path: /volume_data
# You can also use the short syntax in the `name:path` form
# volumes:
#   - my-volume:/volume_data
Once you run this configuration, the volume will be mounted at /volume_data inside the dev environment, and its contents will persist across runs.
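The same volumes property applies to tasks and services, and a run can list more than one volume. The sketch below is illustrative only: the task name, the second volume my-datasets, its mount path, and the commands are assumptions, and both volumes would need to exist in the backend and region where the run is provisioned.

```yaml
type: task
# Hypothetical task name
name: list-volumes
commands:
  # Illustrative commands; replace with your own workload
  - ls /volume_data
  - ls /datasets
volumes:
  # The volume defined earlier in this guide
  - name: my-volume
    path: /volume_data
  # A second, hypothetical volume (assumed to be created separately)
  - name: my-datasets
    path: /datasets
```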
Attach volumes across regions and backends
If you're unsure in advance which region or backend you'd like to use (or which is available), you can specify multiple volumes for the same path.
volumes:
  - name: [my-aws-eu-west-1-volume, my-aws-us-east-1-volume]
    path: /volume_data
dstack will attach one of the volumes based on the region and backend of the run.
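Each name in the list has to refer to a volume that was created or registered beforehand. As a sketch, the first of the two volumes might be defined as follows. The size is an assumption, and a second file with name my-aws-us-east-1-volume and region us-east-1 would define the other one.

```yaml
type: volume
name: my-aws-eu-west-1-volume
# Volumes are bound to a specific backend and region
backend: aws
region: eu-west-1
# Assumed size; pick whatever fits your data
size: 100GB
```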
Container path
When you're running a dev environment, task, or service with dstack, it automatically mounts the project folder contents to /workflow (and sets that as the current working directory). Right now, dstack doesn't allow you to attach volumes to /workflow or any of its subdirectories.
Manage volumes¶
List volumes¶
The dstack volume list command lists created and registered volumes:
$ dstack volume list
NAME       BACKEND  REGION        STATUS  CREATED
my-volume  aws      eu-central-1  active  3 weeks ago
Delete volumes¶
When the volume isn't attached to any active dev environment, task, or service, you can delete it by passing the volume configuration to dstack delete:
$ dstack delete -f volume.dstack.yml
Alternatively, you can delete a volume by passing the volume name to dstack volume delete.
If the volume was created using dstack, it will be physically destroyed along with the data. If you've registered an existing volume, it will be de-registered from dstack but will keep the data.
FAQs¶
Can I use network volumes across backends?
Since volumes are backed by cloud network disks, you can only use them within the same cloud. If you need to access data across different backends, you should either use object storage or replicate the data across multiple volumes.
Can I use network volumes across regions?
Typically, network volumes are associated with specific regions, so you can't use them in other regions. Often, volumes are also linked to availability zones, but some providers support volumes that can be used across different availability zones within the same region.
If you don't want to limit a run to one particular region, you can create different volumes for different regions and specify them for the same mount point as documented above.
Can I attach network volumes to multiple runs or instances?
You can mount a volume in multiple runs. This feature is currently supported only by the runpod backend.
Instance volumes¶
Instance volumes allow mapping any directory on the instance where the run is executed to any path inside the container. This means that the data in instance volumes is persisted only if the run is executed on the same instance.
Attach a volume¶
A run can configure any number of instance volumes. To attach an instance volume, specify the instance_path and path in the volumes property:
type: dev-environment
# A name of the dev environment
name: vscode-vol
ide: vscode
# Map the instance path to any container path
volumes:
  - instance_path: /mnt/volume
    path: /volume_data
# You can also use the short syntax in the `instance_path:path` form
# volumes:
#   - /mnt/volume:/volume_data
Since persistence isn't guaranteed (instances may be interrupted or runs may occur on different instances), use instance volumes only for caching or with directories manually mounted to network storage.
Instance volumes are currently supported for all backends except runpod, vastai, and kubernetes, and can also be used with SSH fleets.
Use instance volumes for caching¶
For example, if a run regularly installs packages with pip install, you can mount the /root/.cache/pip folder inside the container to a folder on the instance for reuse.
type: task
volumes:
- /dstack-cache/pip:/root/.cache/pip
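A fuller version of this task might look like the sketch below. The task name, requirements.txt, and train.py are hypothetical placeholders; only the volumes mapping comes from the configuration above.

```yaml
type: task
# Hypothetical task name
name: train-with-pip-cache
commands:
  # pip can reuse files already stored in /root/.cache/pip when the cache is warm
  - pip install -r requirements.txt
  - python train.py
volumes:
  # Directory on the instance mapped to pip's cache directory in the container
  - /dstack-cache/pip:/root/.cache/pip
```

If a later run lands on the same instance, pip can reuse the cached downloads. On a fresh instance the cache starts empty and the run installs everything from scratch.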
Use instance volumes with SSH fleets¶
If you control the instances (e.g. they are on-prem servers configured via SSH fleets), you can mount network storage (e.g., NFS or SMB) and use the mount points as instance volumes.
For example, if you mount network storage to /mnt/nfs-storage on all hosts of your SSH fleet, you can map this directory via instance volumes and be sure the data is persisted.
type: task
volumes:
- /mnt/nfs-storage:/storage