Deploying NVIDIA Dynamo PD disaggregation with dstack
dstack is an open-source, AI-native orchestrator that works across clouds, Kubernetes clusters, on-prem fleets, hardware vendors, and frameworks. Alongside training, inference is one of the primary use cases dstack supports out of the box.
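With the latest update, dstack adds native support for NVIDIA Dynamo with Prefill-Decode (PD) disaggregation, which serves the prompt-processing (prefill) and token-generation (decode) phases of inference on separate workers. A single dstack service can now run a Dynamo router, prefill workers, and decode workers as separate replica groups.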
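To make the three roles concrete, here is a minimal conceptual sketch in Python. It is not dstack's configuration schema or Dynamo's API: `ReplicaGroup`, `Router`, and the round-robin selection are hypothetical names and a simplifying assumption, used only to show how one request touches a prefill worker and then a decode worker.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class ReplicaGroup:
    """Hypothetical model of a replica group: a named pool of identical workers."""
    name: str
    replicas: int

    def workers(self) -> list[str]:
        # Worker IDs are illustrative, e.g. "prefill-0", "prefill-1".
        return [f"{self.name}-{i}" for i in range(self.replicas)]


class Router:
    """Toy router that assigns each request one prefill and one decode worker.

    Round-robin is an assumption made for brevity; a real router may use
    load or cache state to choose workers.
    """

    def __init__(self, prefill: ReplicaGroup, decode: ReplicaGroup) -> None:
        self._prefill = cycle(prefill.workers())
        self._decode = cycle(decode.workers())

    def route(self, prompt: str) -> dict[str, str]:
        # Prefill processes the whole prompt once; decode then generates
        # output tokens. Each phase runs on a worker from its own group.
        return {
            "prompt": prompt,
            "prefill": next(self._prefill),
            "decode": next(self._decode),
        }


if __name__ == "__main__":
    router = Router(ReplicaGroup("prefill", 2), ReplicaGroup("decode", 1))
    for text in ["hello", "world", "again"]:
        print(router.route(text))
```

Because the groups are separate, the number of prefill and decode replicas can differ, as in the usage example above (two prefill workers, one decode worker).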









