Releases

Supporting GPU provisioning and orchestration on Nebius

As demand for GPU compute continues to scale, open-source tools tailored for AI workloads are becoming critical to developer velocity and efficiency. dstack is an open-source orchestrator purpose-built for AI infrastructure—offering a lightweight, container-native alternative to Kubernetes and Slurm.

Today, we’re announcing native integration with Nebius, offering a streamlined developer experience for teams using GPUs for AI workloads.
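For reference, enabling the new backend should follow the same pattern as other dstack backends: an entry in the server’s config.yml. A minimal sketch, noting that the exact credential fields are our assumption (check the dstack and Nebius docs for the authoritative schema):

```yaml
# ~/.dstack/server/config.yml
projects:
  - name: main
    backends:
      - type: nebius
        creds:
          # Credential field names below are illustrative assumptions;
          # consult the dstack docs for the exact schema.
          type: service_account
          service_account_id: <service account ID>
          public_key_id: <public key ID>
          private_key_file: ~/.nebius/private-key.pem
```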

Built-in UI for monitoring essential GPU metrics

AI workloads generate vast amounts of metrics, making efficient monitoring tools essential. While a recent update introduced the ability to export collected metrics to Prometheus for maximum flexibility, users sometimes need quick access to essential metrics without switching to an external tool.

Previously, we introduced a CLI command that allows users to view essential GPU metrics for both NVIDIA and AMD hardware. Now, with this latest update, we’re excited to announce the addition of a built-in dashboard within the dstack control plane.

Supporting MPI and NCCL/RCCL tests

As AI models grow in complexity, efficient orchestration tools become increasingly important. Fleets, introduced by dstack last year, streamline task execution on both cloud and on-prem clusters, whether it’s pre-training, fine-tuning, or batch processing.

The strength of dstack lies in its flexibility. Users can leverage distributed frameworks such as torchrun or accelerate. dstack handles node provisioning, job execution, and automatically propagates system environment variables—such as DSTACK_NODE_RANK, DSTACK_MASTER_NODE_IP, DSTACK_GPUS_PER_NODE, and others—to containers.
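To make this concrete, here’s a minimal sketch of a distributed task that wires these variables into torchrun. The script name train.py is a placeholder, and DSTACK_NODES_NUM is assumed to be among the propagated variables:

```yaml
type: task
name: train-distrib
nodes: 2
python: "3.12"
commands:
  - |
    # DSTACK_* variables are injected by dstack on each node
    torchrun \
      --nnodes=$DSTACK_NODES_NUM \
      --node_rank=$DSTACK_NODE_RANK \
      --nproc_per_node=$DSTACK_GPUS_PER_NODE \
      --master_addr=$DSTACK_MASTER_NODE_IP \
      --master_port=29500 \
      train.py  # placeholder training script
resources:
  gpu: 8
```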

One use case dstack hasn’t supported until now is MPI, as it requires a scheduled environment or direct SSH connections between containers. Since mpirun is essential for running NCCL/RCCL tests—crucial for validating interconnect performance on large-scale clusters—we’ve added support for it.
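As a rough sketch of what this enables, the task below launches NCCL’s all_reduce_perf across two nodes via mpirun. The image choice, mpirun flags, and the DSTACK_NODES_IPS variable used to build the hostfile are illustrative assumptions rather than the canonical recipe:

```yaml
type: task
name: nccl-tests
nodes: 2
# Assumes an image with Open MPI and nccl-tests preinstalled
commands:
  - |
    if [ "$DSTACK_NODE_RANK" -eq 0 ]; then
      # Build an MPI hostfile from the node IPs (assumed to be newline-separated)
      echo "$DSTACK_NODES_IPS" > hostfile
      mpirun --allow-run-as-root \
        --hostfile hostfile \
        -n "$DSTACK_GPUS_NUM" -N "$DSTACK_GPUS_PER_NODE" \
        all_reduce_perf -b 8 -e 8G -f 2 -g 1
    else
      sleep infinity
    fi
resources:
  gpu: 8
```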

Exporting GPU, cost, and other metrics to Prometheus

Effective AI infrastructure management requires full visibility into compute performance and costs. AI researchers need detailed insights into container- and GPU-level performance, while managers rely on cost metrics to track resource usage across projects.

While dstack provides key metrics through its UI and the dstack metrics CLI, teams often need more granular data and prefer using their own monitoring tools. To support this, we’ve introduced a new endpoint that exports all collected metrics—covering fleets and runs—to Prometheus in real time.
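On the Prometheus side, scraping the endpoint should look roughly like the snippet below. The /metrics path and the server address are assumptions for illustration; check the dstack docs for the exact endpoint:

```yaml
# prometheus.yml (scrape config sketch)
scrape_configs:
  - job_name: dstack
    metrics_path: /metrics  # assumed path of the export endpoint
    static_configs:
      - targets: ["dstack-server.example.com:3000"]  # placeholder address
```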

Accessing dev environments with Cursor

Dev environments enable seamless provisioning of remote instances with the necessary GPU resources, automatic repository fetching, and streamlined access via SSH or a preferred desktop IDE.

Previously, support was limited to VS Code. However, as developers rely on a variety of desktop IDEs, we’ve expanded compatibility. With this update, dev environments now offer effortless access for users of Cursor.
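Selecting Cursor should be a one-line change in the dev environment configuration. A minimal sketch, assuming ide: cursor is the accepted value:

```yaml
type: dev-environment
name: cursor-env
python: "3.12"
# Open the environment in Cursor instead of VS Code
ide: cursor
resources:
  gpu: 24GB
```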

Supporting Intel Gaudi AI accelerators with SSH fleets

At dstack, our goal is to make AI container orchestration simpler and fully vendor-agnostic. That’s why we support not just leading cloud providers and on-prem environments but also a wide range of accelerators.

With our latest release, we’re adding support for Intel Gaudi AI accelerators and launching a new partnership with Intel.
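Since Gaudi machines typically live on-prem, they can be attached through an SSH fleet. A minimal sketch with placeholder addresses, assuming dstack detects the Gaudi accelerators on each host automatically:

```yaml
type: fleet
name: gaudi-fleet
ssh_config:
  user: ubuntu
  identity_file: ~/.ssh/id_rsa
  hosts:  # placeholder addresses of Gaudi nodes
    - 192.168.0.15
    - 192.168.0.16
```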

Auto-shutdown for inactive dev environments—no idle GPUs

Whether you’re using cloud or on-prem compute, you may want to test your code before launching a training task or deploying a service. dstack’s dev environments make this easy by setting up a remote machine, cloning your repository, and configuring your IDE—all within a container that has GPU access.

A common issue with dev environments is forgetting to stop them (or simply closing your laptop), leaving GPUs idle and running up costs. With our latest update, dstack now detects inactive environments and shuts them down automatically, saving you money.
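Per our understanding, the threshold is set per dev environment via an inactivity_duration property; a minimal sketch:

```yaml
type: dev-environment
name: vscode-env
ide: vscode
# Shut down after two hours without an active connection (assumed property name)
inactivity_duration: 2h
resources:
  gpu: 24GB
```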

Introducing GPU blocks and proxy jump for SSH fleets

Recent breakthroughs in open-source AI have made AI infrastructure accessible beyond public clouds, driving demand for running AI workloads in on-premises data centers and private clouds. This shift gives organizations high-performance clusters along with greater flexibility and control.

However, Kubernetes, while a popular choice for traditional deployments, is often too complex and low-level to address the needs of AI teams.

Originally, dstack was focused on public clouds. With the new release, dstack extends support to data centers and private clouds, offering a simpler, AI-native solution that replaces Kubernetes and Slurm.
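The two headline features fit into an SSH fleet configuration. In the sketch below, field names reflect our reading of the release and all addresses are placeholders: blocks splits a host’s GPUs into independently schedulable slices, while proxy_jump routes SSH through a bastion host.

```yaml
type: fleet
name: on-prem-fleet
ssh_config:
  user: ubuntu
  identity_file: ~/.ssh/id_rsa
  # Reach the hosts through a bastion (placeholder address)
  proxy_jump:
    hostname: bastion.example.com
    user: ubuntu
    identity_file: ~/.ssh/bastion_key
  hosts:
    - hostname: 192.168.0.15
      # Let several jobs share this host by splitting its GPUs into blocks
      blocks: auto
```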

Supporting NVIDIA and AMD accelerators on Vultr

As demand for AI infrastructure grows, the need for efficient, vendor-neutral orchestration tools is becoming increasingly important. At dstack, we’re committed to redefining AI container orchestration by prioritizing an AI-native, open-source-first approach. Today, we’re excited to share a new integration and partnership with Vultr.

This new integration enables Vultr customers to train and deploy models on both AMD and NVIDIA GPUs with greater flexibility and efficiency using dstack.
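Enabling the backend should again come down to one entry in the server’s config.yml; a sketch, assuming Vultr authenticates with an API key:

```yaml
# ~/.dstack/server/config.yml
projects:
  - name: main
    backends:
      - type: vultr
        creds:
          type: api_key
          api_key: <your Vultr API key>
```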