Kubernetes¶
The kubernetes backend enables dstack to run dev environments, tasks, and services directly on existing Kubernetes clusters.
If your GPUs are already deployed on Kubernetes and your team relies on its ecosystem and tooling, use this backend to integrate dstack with your clusters.
If Kubernetes is not required, you can run dstack on clouds or on-prem clusters without Kubernetes by using VM-based, container-based, or on-prem backends.
Setting up the backend¶
To use the kubernetes backend with dstack, you need to configure it with the path to the kubeconfig file, the IP address of any node in the cluster, and the port that dstack will use for proxying SSH traffic.
This configuration is defined in the ~/.dstack/server/config.yml file:
projects:
- name: main
  backends:
    - type: kubernetes
      kubeconfig:
        filename: ~/.kube/config
      proxy_jump:
        hostname: 204.12.171.137
        port: 32000
Proxy jump¶
To allow the dstack server and CLI to access runs via SSH, dstack requires a node that acts as a jump host to proxy SSH traffic into containers.
To configure this node, specify hostname and port under the proxy_jump property:
- hostname: the IP address of any cluster node selected as the jump host. Both the dstack server and CLI must be able to reach it. This node can be either a GPU node or a CPU-only node; it makes no difference.
- port: any accessible port on that node, which dstack uses to forward SSH traffic.
No additional setup is required — dstack configures and manages the proxy automatically.
NVIDIA GPU Operator¶
For dstack to correctly detect GPUs in your Kubernetes cluster, the cluster must have the NVIDIA GPU Operator pre-installed.
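One way to confirm that GPUs are schedulable before configuring dstack is to run a throwaway pod that requests a GPU. The manifest below is a minimal sketch using the standard Kubernetes Pod API and the nvidia.com/gpu resource name advertised by the NVIDIA device plugin. The pod name and the image tag are assumptions; substitute any CUDA image you trust.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test  # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: cuda
      # Assumed image tag; replace with a CUDA image available to your cluster
      image: nvidia/cuda:12.4.1-base-ubuntu22.04
      command: ["nvidia-smi"]
      resources:
        limits:
          # Request one GPU through the resource exposed by the device plugin
          nvidia.com/gpu: 1
```

If the pod stays in Pending because no node offers nvidia.com/gpu, the operator is likely missing or not yet ready.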
After the backend is set up, you interact with dstack just as you would with other backends or SSH fleets. You can run dev environments, tasks, and services.
Fleets¶
Clusters¶
If you’d like to run distributed tasks with the kubernetes backend, you first need to create a fleet with placement set to cluster:
type: fleet
# The name is optional; if not specified, one is generated automatically
name: my-k8s-fleet
# For `kubernetes`, `min` should be set to `0` since it can't pre-provision VMs.
# Optionally, you can set the maximum number of nodes to limit scaling.
nodes: 0..
placement: cluster
backends: [kubernetes]
resources:
  # Specify requirements to filter nodes
  gpu: 1..8
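As a variation on the configuration above, the following sketch caps the fleet at four nodes. The fleet name and the upper bound of 4 are illustrative assumptions; the remaining fields mirror the ones already shown.

```yaml
type: fleet
name: my-k8s-fleet-capped  # hypothetical name

# Scale between 0 and 4 nodes; 4 is an arbitrary example limit
nodes: 0..4
placement: cluster

backends: [kubernetes]

resources:
  # Specify requirements to filter nodes
  gpu: 1..8
```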
Then, create the fleet using the dstack apply command:
$ dstack apply -f examples/misc/fleets/.dstack.yml
Provisioning...
---> 100%
FLEET INSTANCE BACKEND GPU PRICE STATUS CREATED
Once the fleet is created, you can run distributed tasks. dstack takes care of orchestration automatically.
For more details on clusters, see the corresponding guide.
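A distributed task that runs on such a fleet might look like the sketch below. It assumes the nodes property and the DSTACK_* environment variables that dstack documents for distributed tasks. The task name, the master port, train.py, and the availability of PyTorch in the image are hypothetical placeholders.

```yaml
type: task
name: train-distrib  # hypothetical name

# Number of nodes the task spans
nodes: 2

commands:
  # train.py is a placeholder for your own training script
  - |
    torchrun \
      --nproc-per-node=$DSTACK_GPUS_PER_NODE \
      --node-rank=$DSTACK_NODE_RANK \
      --nnodes=$DSTACK_NODES_NUM \
      --master-addr=$DSTACK_MASTER_NODE_IP \
      --master-port=12345 \
      train.py

backends: [kubernetes]

resources:
  gpu: 1..8
```

Submit it with dstack apply in the same way as the fleet configuration.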
Fleets with placement set to cluster can be used not only for distributed tasks, but also for dev environments, single-node tasks, and services. Since Kubernetes clusters are interconnected by default, you can always set placement to cluster.
Fleets
It’s generally recommended to create fleets even if you don’t plan to run distributed tasks.
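For example, a single-node dev environment can reuse the same fleet. This is a sketch that assumes the fleet is named my-k8s-fleet, as in the fleet configuration, and that the fleets property is available in your dstack version. The environment name and the resource request are illustrative.

```yaml
type: dev-environment
name: k8s-dev  # hypothetical name
ide: vscode

# Restrict the run to the existing fleet (assumed property; check your dstack version)
fleets: [my-k8s-fleet]

resources:
  gpu: 1
```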
FAQ¶
Is managed Kubernetes with auto-scaling supported?
Managed Kubernetes is supported. However, the kubernetes backend can only run on pre-provisioned nodes.
Support for auto-scalable Kubernetes clusters is coming soon; you can track progress in the corresponding issue.
If on-demand provisioning is important, we recommend using VM-based backends as they already support auto-scaling.
When should I use the Kubernetes backend?
Choose the kubernetes backend if your GPUs already run on Kubernetes and your team depends on its ecosystem and tooling.
If your priority is orchestrating cloud GPUs and Kubernetes isn’t a must, VM-based backends are a better fit thanks to their native cloud integration.
For on-prem GPUs where Kubernetes is optional, SSH fleets provide a simpler and more lightweight alternative.