Kubernetes¶
The `kubernetes` backend enables `dstack` to run dev environments, tasks, and services directly on existing Kubernetes clusters.

If your GPUs are already deployed on Kubernetes and your team relies on its ecosystem and tooling, use this backend to integrate `dstack` with your clusters. If Kubernetes is not required, you can run `dstack` on clouds or on-prem clusters without Kubernetes by using VM-based, container-based, or on-prem backends.
Setting up the backend¶
To use the `kubernetes` backend with `dstack`, you need to configure it with the path to the kubeconfig file, the IP address of any node in the cluster, and the port that `dstack` will use for proxying SSH traffic. This configuration is defined in the `~/.dstack/server/config.yml` file:
```yaml
projects:
- name: main
  backends:
  - type: kubernetes
    kubeconfig:
      filename: ~/.kube/config
    proxy_jump:
      hostname: 204.12.171.137
      port: 32000
```
Proxy jump¶
To allow the `dstack` server and CLI to access runs via SSH, `dstack` requires a node that acts as a jump host to proxy SSH traffic into containers. To configure this node, specify `hostname` and `port` under the `proxy_jump` property:
- `hostname`: the IP address of any cluster node selected as the jump host. Both the `dstack` server and CLI must be able to reach it. This node can be either a GPU node or a CPU-only node; it makes no difference.
- `port`: any accessible port on that node, which `dstack` uses to forward SSH traffic.
No additional setup is required; `dstack` configures and manages the proxy automatically.
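As a quick sanity check, you can verify that the chosen node and port are reachable from the machine running the `dstack` server or CLI. The sketch below uses the sample hostname and port from the configuration above:

```shell
# Check that the jump host's SSH proxy port is reachable
# (hostname and port taken from the sample config above)
nc -vz 204.12.171.137 32000
```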
NVIDIA GPU Operator¶
For `dstack` to correctly detect GPUs in your Kubernetes cluster, the cluster must have the NVIDIA GPU Operator pre-installed.
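If you're unsure whether the operator is installed, one way to check is to list its pods and confirm that nodes advertise the `nvidia.com/gpu` resource. The `gpu-operator` namespace below is the operator's default and may differ in your cluster:

```shell
# List GPU Operator pods (namespace may differ in your setup)
kubectl get pods -n gpu-operator

# Confirm that nodes advertise the nvidia.com/gpu allocatable resource
kubectl get nodes -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'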
After the backend is set up, you interact with `dstack` just as you would with other backends or SSH fleets. You can run dev environments, tasks, and services.
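For instance, a minimal task configuration that runs `nvidia-smi` on a cluster GPU might look like the sketch below (the name is illustrative):

```yaml
type: task
# The name is optional
name: smi-test

commands:
  - nvidia-smi

resources:
  gpu: 1
```

You apply it with `dstack apply -f <path>`, the same way as with any other backend.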
Fleets¶
Clusters¶
If you’d like to run distributed tasks with the `kubernetes` backend, you first need to create a fleet with `placement` set to `cluster`:
```yaml
type: fleet
# The name is optional; if not specified, one is generated automatically
name: my-k8s-fleet

# For `kubernetes`, `min` should be set to `0` since it can't pre-provision VMs.
# Optionally, you can set the maximum number of nodes to limit scaling.
nodes: 0..
placement: cluster

backends: [kubernetes]

resources:
  # Specify requirements to filter nodes
  gpu: 1..8
```
Then, create the fleet using the `dstack apply` command:
```shell
$ dstack apply -f examples/misc/fleets/.dstack.yml

Provisioning...
---> 100%

 FLEET         INSTANCE  BACKEND  GPU  PRICE  STATUS  CREATED
```
Once the fleet is created, you can run distributed tasks; `dstack` takes care of orchestration automatically. For more details on clusters, see the corresponding guide.
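As a minimal sketch, a distributed task sets `nodes` to the number of fleet nodes it should run on, and `dstack` schedules one job per node. The environment variables used below, such as `DSTACK_NODE_RANK` and `DSTACK_NODES_NUM`, are provided by `dstack` for multi-node runs; the name and commands are illustrative:

```yaml
type: task
name: multi-node-test
# Run the task across two nodes of the fleet
nodes: 2

commands:
  - echo "Running on node $DSTACK_NODE_RANK of $DSTACK_NODES_NUM"

resources:
  gpu: 8
```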
Fleets with `placement` set to `cluster` can be used not only for distributed tasks, but also for dev environments, single-node tasks, and services. Since Kubernetes clusters are interconnected by default, you can always set `placement` to `cluster`.
It's generally recommended to create fleets even if you don't plan to run distributed tasks.
FAQ¶
**Is managed Kubernetes with auto-scaling supported?**
Managed Kubernetes is supported. However, the `kubernetes` backend can only run on pre-provisioned nodes. Support for auto-scalable Kubernetes clusters is coming soon; you can track progress in the corresponding issue.

If on-demand provisioning is important, we recommend using VM-based backends, as they already support auto-scaling.
**When should I use the Kubernetes backend?**
Choose the `kubernetes` backend if your GPUs already run on Kubernetes and your team depends on its ecosystem and tooling.
If your priority is orchestrating cloud GPUs and Kubernetes isn’t a must, VM-based backends are a better fit thanks to their native cloud integration.
For on-prem GPUs where Kubernetes is optional, SSH fleets provide a simpler and more lightweight alternative.