
Docker Compose

All backends except runpod, vastai, and kubernetes let you use Docker and Docker Compose inside dstack runs.

This example shows how to deploy Hugging Face Chat UI with TGI serving Llama-3.2-3B-Instruct using Docker Compose.

Prerequisites

Once dstack is installed, go ahead and clone the repo and run dstack init.

$ git clone https://github.com/dstackai/dstack
$ cd dstack
$ dstack init

Deployment

Running as a task

type: task
name: chat-ui-task

privileged: true
image: dstackai/dind
env:
  - MODEL_ID=meta-llama/Llama-3.2-3B-Instruct
  - HF_TOKEN
commands:
  - start-dockerd
  - docker compose up
ports:
  - 9000

# Use either spot or on-demand instances
spot_policy: auto

resources:
  # Required resources
  gpu: "NVIDIA:16GB.."

services:
  app:
    image: ghcr.io/huggingface/chat-ui:sha-bf0bc92
    command:
      - bash
      - -c
      - |
        # Write Chat UI's config: the MongoDB connection and the TGI endpoint
        echo MONGODB_URL=mongodb://db:27017 > .env.local
        echo MODELS='`[{
          "name": "${MODEL_ID?}",
          "endpoints": [{"type": "tgi", "url": "http://tgi:8000"}]
        }]`' >> .env.local
        exec ./entrypoint.sh
    ports:
      # Chat UI serves on 3000; publish it as 9000 to match the task's ports
      - 127.0.0.1:9000:3000
    depends_on:
      - tgi
      - db

  tgi:
    image: ghcr.io/huggingface/text-generation-inference:sha-704a58c
    volumes:
      - tgi_data:/data
    environment:
      HF_TOKEN: ${HF_TOKEN?}
      MODEL_ID: ${MODEL_ID?}
      PORT: 8000
    deploy:
      resources:
        reservations:
          # Give the TGI container access to all GPUs in the run
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

  db:
    image: mongo:latest
    volumes:
      - db_data:/data/db

volumes:
  tgi_data:
  db_data:
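Since this is a plain compose file, you can also try the stack outside dstack on any Docker host that has an NVIDIA GPU and the NVIDIA Container Toolkit installed. A minimal sketch, run from the directory containing the compose file:

$ export MODEL_ID=meta-llama/Llama-3.2-3B-Instruct
$ export HF_TOKEN=<your Hugging Face token>
$ docker compose up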

Deploying as a service

If you'd like to deploy Chat UI as an auto-scalable and secure endpoint, use the service configuration. You can find it at examples/misc/docker-compose/service.dstack.yml
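For reference, here is a minimal sketch of what the service variant looks like. The structure mirrors the task above; the file in the repo is authoritative, and the name shown here is illustrative:

type: service
name: chat-ui-service

privileged: true
image: dstackai/dind
env:
  - MODEL_ID=meta-llama/Llama-3.2-3B-Instruct
  - HF_TOKEN
commands:
  - start-dockerd
  - docker compose up
# The port the service exposes via its endpoint
port: 9000

resources:
  gpu: "NVIDIA:16GB.."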

Running a configuration

To run a configuration, use the dstack apply command.

$ export HF_TOKEN=...

$ dstack apply -f examples/misc/docker-compose/task.dstack.yml

 #  BACKEND  REGION    RESOURCES                    SPOT  PRICE
 1  runpod   CA-MTL-1  18xCPU, 100GB, A5000:24GB    yes   $0.12
 2  runpod   EU-SE-1   18xCPU, 100GB, A5000:24GB    yes   $0.12
 3  gcp      us-west4  27xCPU, 150GB, A5000:24GB:2  yes   $0.23

Submit the run chat-ui-task? [y/n]: y

Provisioning...
---> 100%
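While dstack apply stays attached to the run, it forwards the task's ports to localhost. Once the containers are up, the UI is reachable at http://localhost:9000 (per the ports mapping above), which you can check with, for example:

$ curl http://localhost:9000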

Persisting data

To persist data between runs, create a volume and attach it to the run configuration.
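If the volume doesn't exist yet, you can create it from a volume configuration. A minimal sketch, where the backend, region, and size values are placeholders to adjust for your setup:

type: volume
name: my-dind-volume
backend: aws
region: eu-west-1
size: 100GB

Apply it with dstack apply -f volume.dstack.yml (the filename is arbitrary), then reference the volume by name in the run configuration: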

type: task
name: chat-ui-task

privileged: true
image: dstackai/dind
env:
  - MODEL_ID=meta-llama/Llama-3.2-3B-Instruct
  - HF_TOKEN
commands:
  - start-dockerd
  - docker compose up
ports:
  - 9000

# Use either spot or on-demand instances
spot_policy: auto

resources:
  # Required resources
  gpu: "NVIDIA:16GB.."

volumes:
  # Mount the volume at Docker's data directory so all Docker state lands on it
  - name: my-dind-volume
    path: /var/lib/docker

With this change, all Docker data is persisted: pulled images, containers, and, crucially, the compose volumes that hold the database and the model files.
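To check, you can re-run the task and inspect Docker's state inside the run. Assuming the run is attached so that dstack's SSH host entry for it is in place, something like:

$ ssh chat-ui-task
$ docker volume ls   # the compose volumes for TGI and MongoDB survive
$ docker images      # previously pulled images need no re-download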

Source code

The source code of this example can be found in examples/misc/docker-compose.

What's next?

  1. Check dev environments, tasks, services, and protips.