Orchestrate AI workloads in any cloud
dstack is an open-source orchestration engine for running AI workloads at scale in any cloud or data center, leveraging the open-source ecosystem.
Dev environments
Before submitting a task or deploying a model, you may want to run code interactively. Dev environments allow you to do exactly that.
You specify the required environment and resources, then run it. dstack provisions the dev environment in the cloud and enables access via your desktop IDE.
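As a sketch, a dev environment configuration is a small YAML file (the Python version, IDE, and resources below are illustrative, not required values):

```yaml
type: dev-environment
# Python version to pre-install (illustrative)
python: "3.11"
# Desktop IDE to attach to the cloud machine
ide: vscode
# Resource requirements (illustrative)
resources:
  gpu: 24GB
```

Running `dstack run .` in the directory containing the configuration typically provisions the environment and prints the details needed to attach your IDE.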
Tasks
Tasks allow for convenient scheduling of any kind of batch job, such as training, fine-tuning, or data processing, as well as running web applications.
Specify the environment and resources, then run it. dstack executes the task in the cloud, enabling port forwarding to your local machine for convenient access.
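A task configuration might look like the following sketch (the script names, port, and resources are placeholders):

```yaml
type: task
python: "3.11"
# Commands run sequentially in the cloud (script names are placeholders)
commands:
  - pip install -r requirements.txt
  - python train.py
# Ports forwarded to your local machine, e.g. for TensorBoard
ports:
  - 6006
# Resource requirements (illustrative)
resources:
  gpu: 24GB
```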
Services
Services make it easy to deploy any kind of model or web application as a public endpoint.
Use any serving framework and specify the required resources. dstack deploys it in the configured backend, handles authentication and auto-scaling, and provides an OpenAI-compatible interface if needed.
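As a sketch, a service serving a model through vLLM's OpenAI-compatible server might look like this (the model name, port, and resources are illustrative assumptions):

```yaml
type: service
python: "3.11"
commands:
  - pip install vllm
  # Model name is illustrative; any model supported by vLLM works
  - python -m vllm.entrypoints.openai.api_server --model mistralai/Mistral-7B-Instruct-v0.2 --port 8000
# The port the service listens on; dstack exposes it via the gateway
port: 8000
resources:
  gpu: 80GB
```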
Pools
Pools simplify managing the lifecycle of cloud instances and enable their efficient reuse across runs.
You can have instances provisioned in the cloud automatically, or add them manually, configuring the required resources, idle duration, and other settings.
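As a sketch, an instance can be added to a pool from the CLI; the exact flags below are assumptions and may differ between dstack versions, so check the CLI help for yours:

```shell
# Provision an instance into the pool, terminating it after 1h idle
# (flag names are assumptions; see `dstack pool add --help`)
dstack pool add --gpu 24GB --idle-duration 1h
```

Instances added this way can then be reused by subsequent dev environments, tasks, and services instead of provisioning fresh cloud machines per run.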
Examples
Mixtral 8x7B
Deploy Mixtral 8x7B as a service using vLLM, an open-source serving library.
QLoRA
Fine-tune Llama 2 on a custom dataset using PEFT and TRL, open-source training libraries.
vLLM
Deploy LLMs with vLLM, an open-source serving library.
Text Generation Inference
Deploy LLMs using TGI, an open-source serving library.
Ollama
Deploy LLMs using Ollama, an open-source serving tool.
Text Embeddings Inference
Deploy a text embeddings model using TEI, an open-source serving library.