Orchestrate GPU workloads effortlessly on any cloud
dstack is an open-source engine that streamlines the development, training, and deployment of AI models across any cloud provider.
Experiment interactively in your IDE, terminal, or Jupyter notebooks before submitting long-running tasks or deploying models.
With dstack, a single command provisions the required cloud resources, fetches your code, and prepares the environment for your dev setup.
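As an illustration, a dev environment configuration could look like the sketch below. The file name, the `ide` and `resources` fields, and the CLI command in the comment are assumptions based on dstack's YAML format and may differ between versions.

```yaml
# .dstack.yml: a minimal dev environment sketch (assumed field names; adjust to your dstack version)
type: dev-environment

# Pre-installed Python version; alternatively, point `image` at a Docker image
python: "3.11"

# Open the provisioned environment directly in your IDE
ide: vscode

# Request a GPU (the exact resource syntax is an assumption)
resources:
  gpu: 24GB

# Provision it with the dstack CLI, for example:
#   dstack apply -f .dstack.yml    (older versions: dstack run . -f .dstack.yml)
```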
With dstack, it's easy to run tasks such as training or fine-tuning scripts, or any other batch jobs.
Simply specify the commands, the ports, and either a Python version or a Docker image. dstack handles execution on the configured cloud GPU provider(s) with the necessary resources.
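For example, a task configuration can be as small as the sketch below; the script name, port, and GPU size are placeholders, and the exact schema depends on your dstack version.

```yaml
# train.dstack.yml: a minimal task sketch (script name, port, and GPU size are placeholders)
type: task

# Either a Python version or a Docker image (via `image`)
python: "3.11"

commands:
  - pip install -r requirements.txt
  - python train.py

# Ports to expose while the task runs (e.g. TensorBoard)
ports:
  - 6006

resources:
  gpu: 24GB
```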
With dstack, deploying models or any other web apps is straightforward.
Just specify the commands, the port, and either a Python version or a Docker image. dstack handles deployment on the configured cloud GPU provider(s) and gives you a public HTTPS endpoint.
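A service configuration follows the same pattern, as in the sketch below; the model, serving command, port, and GPU size are illustrative placeholders rather than a prescribed setup.

```yaml
# serve.dstack.yml: a minimal service sketch (model, command, port, and GPU size are placeholders)
type: service

python: "3.11"

commands:
  - pip install vllm
  - python -m vllm.entrypoints.openai.api_server --model mistralai/Mistral-7B-Instruct-v0.2 --port 8000

# The port your app listens on; dstack exposes it behind a public HTTPS endpoint
port: 8000

resources:
  gpu: 24GB
```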
Deploy Mixtral 8x7B as a service using vLLM, an open-source serving library.
Text Embeddings Inference
Deploy text embedding models using Services and TEI, an open-source text embeddings toolkit by Hugging Face.
Use Llama Index and Weaviate to enhance the capabilities of LLMs with the context of your data.
Fine-tune Llama 2 on a custom dataset with QLoRA and your own script, using Tasks.
Text Generation Inference
Deploy LLMs using Services and TGI, an open-source serving framework by Hugging Face.
Deploy LLMs using Services and vLLM, an open-source serving library.
Get started in a minute
The open-source version lets you run workloads using your own cloud accounts. It can be used via the CLI or API and supports configuring multiple projects and users.
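With the open-source version, you typically describe your cloud accounts in the server's configuration file. The sketch below assumes a `~/.dstack/server/config.yml` layout with a projects/backends structure; the exact schema, backend types, and credential options depend on your dstack version.

```yaml
# ~/.dstack/server/config.yml: a sketch of configuring cloud backends for a project
# (assumed schema; supported backends and credential options vary by version)
projects:
  - name: main
    backends:
      # Use the default AWS credentials available on the machine running the server
      - type: aws
        creds:
          type: default
      # Other providers (e.g. GCP or Azure) could be added similarly, each with its own credentials
```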
dstack Sky is a fully managed service that runs workloads across multiple cloud providers to get you the best GPU pricing and availability. You don't need individual accounts with each provider; dstack Sky manages everything for you.